Abstract: We consider the time correlated multiple-input 
single-output (MISO) broadcast channel where the transmitter 
has imperfect knowledge of the current channel state, in addition 
to delayed channel state information. By representing the quality 
of the current channel state information as $P^{-\alpha}$ for the 
signal-to-noise ratio $P$ and some constant $\alpha \ge 0$, we characterize 
the optimal degrees of freedom region for this more general two-user 
MISO broadcast correlated channel. The essential ingredients 
of the proposed scheme lie in the quantization and multicast 
of the overheard interferences, while broadcasting new private 
messages. Our proposed scheme smoothly bridges between the 
scheme recently proposed by Maddah-Ali and Tse with no 
current state information and a simple zero-forcing beamforming 
with perfect current state information. 



I. Introduction 

In most practical scenarios, perfect channel state information 
at transmitter (CSIT) may not be available due to the time- 
varying nature of wireless channels as well as the limited 
resource for channel estimation. However, many wireless 
applications must guarantee high-data rate and reliable commu- 
nication in the presence of channel uncertainty. In this paper, 
we consider such a scenario in the context of the two-user 
multiple-input single-output (MISO) broadcast channel, where 
the transmitter equipped with $m$ antennas ($m \ge 2$) wishes to 
send two private messages to two receivers each with a single 
antenna. The discrete time signal model is given by 

\begin{align}
y_t &= h_t^H x_t + \varepsilon_t, \tag{1a}\\
z_t &= g_t^H x_t + \omega_t, \tag{1b}
\end{align}

for any time instant $t$, where $h_t, g_t \in \mathbb{C}^{m \times 1}$ are the channel 
vectors for user 1 and user 2, respectively; $\varepsilon_t, \omega_t \sim \mathcal{N}_{\mathbb{C}}(0, 1)$ 
are normalized additive white Gaussian noises (AWGN) at the 
respective receivers; the input signal $x_t$ is subject to the power 
constraint $\mathbb{E}(\|x_t\|^2) \le P$, $\forall t$. 

For the case of perfect CSIT, the optimal degrees of 
freedom (DoF) of this channel is two, achieved by linear 
strategies such as zero-forcing (ZF) beamforming. When 
the transmitter suffers from constant inaccuracy of channel 
estimation, it has been shown in [1] that the degrees of 
freedom per user is upper-bounded by 2/3, whereas the highest 
known achievable DoF value, also conjectured to be optimal, 
is only 1/2. It is also well known that the full multiplexing 
gain can be maintained under imperfect CSIT if the error in 
CSIT decreases as $O(P^{-1})$ as $P$ grows [2]. Moreover, for 
the case of the temporally correlated fading channel such 
that the transmitter can predict the current state with error 
decaying as $O(P^{-\alpha})$ for some constant $\alpha \in [0, 1]$, ZF can only 
achieve a fraction $\alpha$ of the optimal degrees of freedom [2]. This 
result reveals the bottleneck of a family of precoding 
schemes relying only on instantaneous CSIT as the temporal 
correlation decreases ($\alpha \to 0$). Recently, a breakthrough has 
been made in order to overcome this problem. In [3], Maddah-Ali 
and Tse showed a surprising result that even completely 
outdated CSIT can be very useful in terms of degrees of 
freedom, as long as it is accurate. For a system with $m \ge 2$ 
antennas and two users, the proposed scheme in [3], hereafter 
called MAT, achieves the multiplexing gain of 2/3 per user, 
irrespective of the temporal correlation. The role of perfect 
delayed CSIT can be re-interpreted as a feedback of the past 
signal/interference heard by the receivers. This side information 
enables the transmitter to perform "retrospective" alignment 
in the space and time domain, as demonstrated in different 
multiuser network systems (see [4] and the references therein). 
Despite its DoF optimality, the MAT scheme is designed 
assuming the worst case scenario where the delayed channel 
feedback provides no information about the current channel 
state. This assumption is overly pessimistic as most practical 
channels exhibit some form of temporal correlation. In fact, it 
readily follows that the selection strategy between ZF and MAT 
yields the degrees of freedom of $\max\{\alpha, 2/3\}$ for $\alpha \in [0, 1]$. 
For either quasi-static fading channels ($\alpha \ge 1$) or very fast 
channels ($\alpha \to 0$), a selection approach is reasonable. However, 
for intermediate ranges of temporal correlation ($0 < \alpha < 1$), 
a fundamental question arises as to whether a better way of 
exploiting both delayed CSIT and current (imperfect) CSIT 
exists. Studying the DoF under such a CSIT assumption is of 
practical and theoretical interest. 

The main contributions of this work are summarized in 
the following. First, we establish an outer bound on the DoF 
region of the two-user broadcast channel with perfect delayed 
and imperfect current state information. To that end, we use 
two powerful tools: the genie-aided model and the extremal 
inequality [5], [6]. Then, we propose a novel scheme that 
optimally combines the ZF spatial precoding, based on the 
imperfect current state information, and the MAT space-time 
alignment, based on the perfect past state information. The 
key of this scheme is the digital transmission of the overheard 
interference, which replaces the analog one initially considered 
in the MAT alignment [3]. The role of spatial precoding, 
exploiting current CSIT, is two-fold: 

• It reduces the power of overheard interferences 
in the MAT alignment. This power reduction then saves, 
via source compression/quantization, the resource related 
to the transmission of the overheard interferences. 

• It allows for the parallel transmission of two private 
messages on top of the multicast of overheard interferences 
as common message. 

It will be shown that the proposed scheme achieves the upper 
bound of the symmetric DoF 
given by the converse. To achieve the other corner points of 
the region, we show that delayed CSIT is not necessary and 
the optimal strategy is a combination of rate-splitting, spatial 
precoding with imperfect current CSI, and superposition coding. 
Specifically, we split one user's message into two parts 
and broadcast one part of it as common message. The other 
part and the message of the other user are then superimposed 
over the common message and broadcast with spatial precoding. 
As an extension to the main result, we derive the optimal DoF 
region of the same channel with common message. Another 
extension is the achievable DoF region when only imperfect 
delayed CSIT is available (e.g., due to limited feedback rates). 
Finally, in addition to the results on the optimal DoF region, 
we provide the exact achievable rate regions of the proposed 
schemes in the appendix. 

At the time of submission, a parallel independent work [7] 
was brought to our attention which also builds on our initial 
results reported in [8]. In [7], the authors consider an i.i.d. 
fading model in which the transmitter knows perfectly the 
past channel states and imperfectly the current channel state. 
Their achievability proof coincides with our optimal scheme, 
while the outer bound is derived differently by establishing 
an equivalent compound channel. It is worth noting that the 
outer bound technique developed in [7] does not rely on any 
essential statistical equivalence of the two users' channel vector 
directions, which is stronger than both the original result of 
[3] as well as the result in this work (that exploits the isotropic 
property of the estimation error). On the other hand, our model 
allows temporal correlations of the channel coefficients and is 
therefore stronger than both the original result [3] and [7] in 
that sense. Thus, while both [7] and the current work generalize 
[3], neither subsumes the other. 

The rest of the paper is organized as follows. In Section II, 
after presenting the assumptions and some basic definitions of 
our model, we provide our main theorem on the optimal DoF 
region. The above contributions are then presented in order. 
Finally, we conclude the paper in Section VI. Detailed proofs 
are deferred to the appendix. 



Throughout the paper, we will use the following notations. 
Matrix transpose, Hermitian transpose, inverse, and determinant 
are denoted by $A^T$, $A^H$, $A^{-1}$, and $\det(A)$, respectively. $x^{\perp}$ 
is any nonzero vector such that $x^H x^{\perp} = 0$. Logarithm is in 
base 2. Partial ordering of Hermitian matrices is denoted by $\succeq$ 
and $\preceq$, i.e., $A \succeq B$ means $A - B$ is positive semidefinite. We 
use $\Psi_x$ to denote a projection matrix on the direction given 
by $x$, i.e., $\Psi_x \triangleq \frac{x x^H}{\|x\|^2}$. 

II. System Model and Main Results 

The signal model of this paper is defined by (1a) and (1b). 
For convenience, we provide the following definition. 

Definition 1 (channel states): The channel vectors $h_t$ and 
$g_t$ are called the states of the channel at instant $t$. For simplicity, 
we also define the state matrix $S_t$ as 
$S_t \triangleq \begin{bmatrix} h_t^H \\ g_t^H \end{bmatrix} \in \mathcal{S}$, 
where $\mathcal{S}$ is the set of all possible states. 

The assumptions on the knowledge of the channel states and 

the fading process are summarized as follows. 

Assumption 1 (perfect delayed and imperfect current CSI): 
At each time instant $t$, the transmitter knows the delayed 
channel states up to instant $t - 1$. In addition, the transmitter 
can somehow obtain an estimate $\hat{S}_t \in \mathcal{S}$ of the current channel 
state $S_t$, i.e., $\hat{h}_t$ and $\hat{g}_t$ are available to the transmitter with 

\begin{align}
h_t &= \hat{h}_t + \tilde{h}_t, \notag\\
g_t &= \hat{g}_t + \tilde{g}_t, \notag
\end{align}

where the estimate $\hat{h}_t$ (also $\hat{g}_t$) and estimation error $\tilde{h}_t$ (also 
$\tilde{g}_t$) are uncorrelated and both assumed to be zero mean with 
covariance $(1 - \sigma^2) I_m$ and $\sigma^2 I_m$, respectively, with $\sigma^2 \le 1$. 
The receivers know perfectly all states $\{S_t\}$ and $\{\hat{S}_t\}$. 

Assumption 2 (fading process): The processes $\{\hat{S}_t\}$, $\{\tilde{S}_t\}$, 
and thus $\{S_t\}$ are stationary and ergodic. Moreover, for any 
time instant $t$, we assume the following: 

1) $\mathrm{rank}(S_t) = 2$ with probability 1 and $\mathbb{E}(\log\det(S_t S_t^H)) > -\infty$. 

2) We have the Markov chain 

$$(S^{t-1}, \hat{S}^{t-1}) \leftrightarrow \hat{S}_t \leftrightarrow S_t. \tag{2}$$

3) The estimation error is isotropic, i.e., the distributions 
of $\tilde{h}_t$ and $\tilde{g}_t$ conditional on $\hat{S}_t$ are invariant under 
unitary transformations. Furthermore, for any $\sigma^2 > 0$, 
Note that when $\{\hat{S}_t\}$ and $\{\tilde{S}_t\}$ are independent Rayleigh 
fading processes with independent and identically distributed 
(i.i.d.) entries, all the above assumptions are verified. 
Without loss of generality, we implicitly assume that $\sigma^2 > 0$ 
in the rest of the paper. The case with $\sigma^2 = 0$ corresponds 
to the case of perfect CSIT, in which the capacity region is 
already known. Then, we can introduce a parameter $\alpha_P \ge 0$ 
as the power exponent of the estimation error 

$$\alpha_P \triangleq -\frac{\log(\sigma^2)}{\log P}.$$

The parameter $\alpha_P$ can be regarded as the quality of the current 
CSIT in the high SNR regime. Note that $\alpha_P = 0$ corresponds 
to the case with no current CSIT at all, while $\alpha_P \to \infty$ 
corresponds to the case with perfect current CSIT. In addition, 
we assume that $\lim_{P \to \infty} \alpha_P$ exists and define 

$$\alpha \triangleq \lim_{P \to \infty} \alpha_P.$$

Hereafter, we use $\alpha$ instead of $\alpha_P$, whenever no confusion 
is likely. In addition, since $\alpha > 1$ implies that the estimation 
noise is negligible as compared to the AWGN and can be 
regarded as perfect from the DoF perspective, we assume 
implicitly that the value of $\alpha > 1$ is truncated at 1 wherever 
applicable. Connections between the above model and practical 
time correlated models are highlighted later in the paper. 
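The relation between the error variance and the exponent can be made concrete with a small numerical sketch. The function names below are illustrative and not part of the paper; the code only evaluates $\alpha_P = -\log(\sigma^2)/\log P$ and applies the truncation at 1 described above.

```python
import math


def csit_quality_exponent(sigma2: float, snr: float) -> float:
    """Return alpha_P = -log(sigma^2) / log(P) for error variance sigma2 and SNR P."""
    if not (0.0 < sigma2 <= 1.0):
        raise ValueError("sigma2 must lie in (0, 1]")
    if snr <= 1.0:
        raise ValueError("snr must exceed 1 so that log(P) > 0")
    # log(1 / sigma2) equals -log(sigma2) and avoids returning -0.0 when sigma2 = 1
    return math.log(1.0 / sigma2) / math.log(snr)


def dof_relevant_exponent(alpha: float) -> float:
    """Truncate the exponent at 1, as the text does for alpha > 1."""
    return min(alpha, 1.0)


if __name__ == "__main__":
    P = 1e4
    for sigma2 in (1.0, P ** -0.5, P ** -1.0, P ** -2.0):
        a = csit_quality_exponent(sigma2, P)
        print(f"sigma^2={sigma2:.3e}  alpha_P={a:.3f}  truncated={dof_relevant_exponent(a):.3f}")
```

With $\sigma^2 = 1$ the exponent is 0 (no current CSIT), and any variance decaying faster than $P^{-1}$ is treated as $\alpha = 1$.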

Definition 2 (achievable degrees of freedom): A code for 
the two-user Gaussian MISO broadcast channel with delayed 
CSIT and imperfect current CSIT is defined as follows: 

• A sequence of encoders at time $t$ is given by 
$F_t : \mathcal{W}_1 \times \mathcal{W}_2 \times \mathcal{S}^{t-1} \times \mathcal{S}^{t} \to \mathbb{C}^{m \times 1}$, 
where the messages $W_1$ and $W_2$ are uniformly distributed over 
the message sets $\mathcal{W}_1$ and $\mathcal{W}_2$, respectively. 

• A decoder for user $k$ is given by the mapping $\hat{W}_k$. 

The DoF pair $(d_1, d_2)$ is said to be achievable if there exists 
a code that simultaneously satisfies the reliability condition 

$$\limsup_{n \to \infty} \Pr\{W_k \ne \hat{W}_k\} = 0,$$

and has a pre-log factor of the rate 

$$\lim_{P \to \infty} \liminf_{n \to \infty} \frac{\log_2 |\mathcal{W}_k(n, P)|}{n \log_2 P} \ge d_k, \quad k = 1, 2.$$

The union of all achievable DoF pairs is then called the optimal 
DoF region of the Gaussian MISO broadcast channel. 

The main result of this paper is stated below. 

Theorem 1: The optimal degrees of freedom region of 
the two-user Gaussian MISO broadcast channel with perfect 
delayed and imperfect current CSIT is characterized by 

\begin{align}
d_1 &\le 1, \tag{3a}\\
d_2 &\le 1, \tag{3b}\\
d_1 + 2 d_2 &\le 2 + \alpha, \tag{3c}\\
2 d_1 + d_2 &\le 2 + \alpha. \tag{3d}
\end{align}

As shown in Fig. 1, the DoF region is a polygon characterized 
by the vertices: $(0, 1)$, $(\alpha, 1)$, $(\frac{2+\alpha}{3}, \frac{2+\alpha}{3})$, $(1, \alpha)$, $(1, 0)$. Note 
that the region collapses to the MAT region [3] when the 
quality of current CSIT is poor ($\alpha \to 0$), whereas it grows 
smoothly towards the DoF region with perfect CSIT when $\alpha$ 
increases. In the following sections, we start with the converse 
proof by establishing outer bounds. Then, we propose schemes 
that achieve the corner points of the region. 
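The vertex list can be checked mechanically. The sketch below, whose function names are illustrative, intersects every pair of boundary lines of (3a)-(3d) together with $d_1, d_2 \ge 0$ and keeps the feasible intersection points. It uses exact rational arithmetic, so the result is not affected by rounding. The origin appears in the output as well, since non-negativity is included as a constraint.

```python
from fractions import Fraction
from itertools import combinations


def dof_constraints(alpha):
    """Half-planes a*d1 + b*d2 <= c for (3a)-(3d), plus d1 >= 0 and d2 >= 0."""
    a = Fraction(alpha)
    return [
        (1, 0, Fraction(1)),   # (3a) d1 <= 1
        (0, 1, Fraction(1)),   # (3b) d2 <= 1
        (1, 2, 2 + a),         # (3c) d1 + 2 d2 <= 2 + alpha
        (2, 1, 2 + a),         # (3d) 2 d1 + d2 <= 2 + alpha
        (-1, 0, Fraction(0)),  # d1 >= 0
        (0, -1, Fraction(0)),  # d2 >= 0
    ]


def region_vertices(alpha):
    """Return the sorted vertices of the polygon defined by dof_constraints(alpha)."""
    cons = dof_constraints(alpha)
    vertices = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundaries never intersect
        # Cramer's rule for the two boundary lines
        d1 = (c1 * b2 - c2 * b1) / det
        d2 = (a1 * c2 - a2 * c1) / det
        if all(a * d1 + b * d2 <= c for a, b, c in cons):
            vertices.add((d1, d2))
    return sorted(vertices)


if __name__ == "__main__":
    for alpha in (Fraction(0), Fraction(1, 2), Fraction(1)):
        print(alpha, [(str(x), str(y)) for x, y in region_vertices(alpha)])
```

For $\alpha = 1$ the three middle vertices merge into $(1, 1)$, which matches the perfect-CSIT region.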



III. Converse 

In this section, we establish the converse proof of the main 
result. Before going into the details, we would like to point 
out the essential elements of the upcoming proof: 

• Genie-aided model: construct a degraded broadcast channel, 
as in [3]. 

• Extremal inequality: bound the weighted difference of 
differential entropies [5]. 

• Isotropic property of the channel uncertainty: tight upper 
bound on the pre-log factor. 

Fig. 1. DoF region of a two-user MISO channel with perfect delayed and 
imperfect current CSI at the transmitter. The estimation error of the current 
state scales as $P^{-\alpha}$. 

First, let us consider the genie-aided model where the genie 
provides the received signal $\{z_t\}$ of user 2 to user 1. This is a 
degraded broadcast channel $X \leftrightarrow (Y, Z) \leftrightarrow Z$. Therefore, we 
have the following upper bounds on the rates $(R_1, R_2)$: 



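\begin{align}
n R_1 &\le H(W_1) \notag\\
&= H(W_1 \mid S^n, \hat{S}^n) \notag\\
&\le I(W_1; Y^n, Z^n \mid S^n, \hat{S}^n) + n\epsilon_n \tag{4}\\
&\le I(W_1; Y^n, Z^n, W_2 \mid S^n, \hat{S}^n) + n\epsilon_n \notag\\
&= I(W_1; Y^n, Z^n \mid S^n, \hat{S}^n, W_2) + n\epsilon_n \notag\\
&= \sum_{i=1}^{n} I(W_1; Y_i, Z_i \mid Y^{i-1}, Z^{i-1}, S^n, \hat{S}^n, W_2) + n\epsilon_n \notag\\
&\le \sum_{i=1}^{n} I(X_i; Y_i, Z_i \mid Y^{i-1}, Z^{i-1}, S^n, \hat{S}^n, W_2) + n\epsilon_n \tag{5}\\
&= \sum_{i=1}^{n} I(X_i; Y_i, Z_i \mid Y^{i-1}, Z^{i-1}, S^i, \hat{S}^i, W_2) + n\epsilon_n \tag{6}\\
&= \sum_{i=1}^{n} \big( h(Y_i, Z_i \mid Y^{i-1}, Z^{i-1}, S^i, \hat{S}^i, W_2) \notag\\
&\qquad\quad - h(Y_i, Z_i \mid X_i, Y^{i-1}, Z^{i-1}, S^i, \hat{S}^i, W_2) \big) + n\epsilon_n \notag\\
&= \sum_{i=1}^{n} \big( h(Y_i, Z_i \mid T_i, S_i) - h(E_i, \Omega_i) \big) + n\epsilon_n \notag\\
&\le \sum_{i=1}^{n} h(Y_i, Z_i \mid T_i, S_i) + n\epsilon_n, \tag{7}
\end{align}

\begin{align}
n R_2 &\le H(W_2) \notag\\
&= H(W_2 \mid S^n, \hat{S}^n) \notag\\
&\le I(W_2; Z^n \mid S^n, \hat{S}^n) + n\epsilon_n \tag{8}\\
&= \sum_{i=1}^{n} I(W_2; Z_i \mid Z^{i-1}, S^i, \hat{S}^i) + n\epsilon_n \tag{9}\\
&= \sum_{i=1}^{n} \big( h(Z_i \mid Z^{i-1}, S^i, \hat{S}^i) - h(Z_i \mid Z^{i-1}, S^i, \hat{S}^i, W_2) \big) + n\epsilon_n \notag\\
&\le \sum_{i=1}^{n} \big( h(Z_i \mid S_i) - h(Z_i \mid Y^{i-1}, Z^{i-1}, S^i, \hat{S}^i, W_2) \big) + n\epsilon_n \tag{10}\\
&= \sum_{i=1}^{n} \big( h(Z_i \mid S_i) - h(Z_i \mid T_i, S_i) \big) + n\epsilon_n, \tag{11}
\end{align}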


where we define $T_i \triangleq (Y^{i-1}, Z^{i-1}, S^{i-1}, \hat{S}^{i}, W_2)$. Note that 
the above chains of inequalities follow closely Gallager's 
proof for the degraded broadcast channel [9] (also see [10]), 
with the integration of the channel states. In particular, (4) 
and (8) are from Fano's inequality; (5) is from the data 
processing inequality; (6) holds because the input $X_i$ and the 
outputs $(Y_i, Z_i)$ of the channel at instant $i$ do not depend 
on the future states given the past and current states; (9) 
results from the same reasoning and the chain rule of mutual 
information; (7) is from the non-negativity of the differential 
entropy of unit-variance AWGN, i.e., $h(E_i, \Omega_i) \ge 0$; (10) 
holds since removing (resp. adding) conditions does not 
decrease (resp. increase) differential entropy. In the following, 
we would like to obtain an upper bound on $R_1 + 2R_2$. From 
(7) and (11), we have 

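$$n(R_1 + 2R_2) \le \sum_{i=1}^{n} \big( 2h(Z_i \mid S_i) + h(Y_i, Z_i \mid T_i, S_i) - 2h(Z_i \mid T_i, S_i) \big) + 3n\epsilon_n. \tag{12}$$

Now, we can upper-bound each term in the above summation: 

\begin{align}
&2h(Z_i \mid S_i) + h(Y_i, Z_i \mid T_i, S_i) - 2h(Z_i \mid T_i, S_i) \notag\\
&\quad \le \max_{P_{T_i} P_{X_i|T_i}} \big( 2h(Z_i \mid S_i) + h(Y_i, Z_i \mid T_i, S_i) - 2h(Z_i \mid T_i, S_i) \big) \notag\\
&\quad \le \max_{P_{T_i} P_{X_i|T_i}} 2h(Z_i \mid S_i) + \max_{P_{T_i} P_{X_i|T_i}} \big( h(Y_i, Z_i \mid T_i, S_i) - 2h(Z_i \mid T_i, S_i) \big). \tag{13}
\end{align}

The first maximization can be upper-bounded as: 

\begin{align}
\max_{P_{T_i} P_{X_i|T_i}} 2h(Z_i \mid S_i) &\le 2\,\mathbb{E}_{G_i}\Big( \max_{P_{X_i | G_i = g_i}} h(g_i^H X_i + \Omega_i) \Big) \notag\\
&\le 2\,\mathbb{E}_{G_i}\big( \log(1 + P \|g_i\|^2) \big) \notag\\
&\le 2\log P + O(1) \tag{14}
\end{align}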



where, to get the first inequality, we put the maximization into 
the expectation; the second inequality is from the fact that 
Gaussian distribution maximizes differential entropy under the 
covariance constraint, that the logarithmic function is 
monotonically increasing, and that the following partial ordering 
holds $\mathrm{Cov}(X_i \mid g_i) \preceq \mathrm{Cov}(X_i) \preceq P I$; the last one is from 
Jensen's inequality. The second maximization in (13) can also 
be bounded, but in a slightly more involved way, as shown 
in (15)-(20). We get (15) by putting 
one of the maximizations into the expectation, which does 
not decrease the value; in (16), we define $N_i \triangleq [E_i \ \ \Omega_i]^T$; 
(17) is obtained by splitting one maximization into two, one 
with the trace constraint and the other with the covariance 
constraint; (18) is from the fact that with covariance constraint, 
Gaussian distribution maximizes the weighted difference of 
two differential entropies, given that i) $S_i$ is independent of 
$X_i$ conditional on $T_i = (Y^{i-1}, Z^{i-1}, S^{i-1}, \hat{S}^{i}, W_2)$ due to the 
Markov chain (2) and the fact that $X_i$ is a function of the messages 
$(W_1, W_2)$, the past states $S^{i-1}$, and the estimates up to the 
current state $\hat{S}^{i}$, and that ii) $Z_i$ is a degraded version of $(Y_i, Z_i)$; 
this is an application of the extremal inequality [5], [6]; note 
that $K_* \preceq C$ is defined as the optimal covariance for the inner 
maximization; (19) holds because any $K$ such that $0 \preceq K \preceq C$ 
with $\mathrm{tr}(C) \le P$ belongs to the set $\{K : K \succeq 0, \mathrm{tr}(K) \le P\}$, 
and that the whole term only depends on $\hat{S}_i$; the last inequality 
is from the fact that $\det(I + A) \le (1 + a_{11})(1 + a_{22})$ for any 
$2 \times 2$ positive semidefinite matrix $A$ with diagonal entries $a_{11}$ and $a_{22}$ 
(Hadamard's inequality). 

Lemma 1: For any given $K \succeq 0$ with eigenvalues $\lambda_1 \ge \cdots \ge \lambda_m \ge 0$, 
we have 

\begin{align}
\mathbb{E}_{S_i | \hat{S}_i}\big( \log(1 + h_i^H K h_i) \big) &\le \log(1 + \|\hat{h}_i\|^2 \lambda_1) + O(1), \tag{21}\\
\mathbb{E}_{S_i | \hat{S}_i}\big( \log(1 + g_i^H K g_i) \big) &\ge \log(1 + 2^{\gamma} \sigma^2 \lambda_1) + O(1), \tag{22}
\end{align}

with 

$$\gamma \triangleq \mathbb{E}_{S_i | \hat{S}_i}\big( \log \cdots \big).$$

Proof: See Appendix A. ■ 

It is worth noting that $\gamma$ is finite according to Assumption 2. 
Therefore, $2^{\gamma}$ is a strictly positive and bounded value that can 
be regarded as constant as far as the DoF is concerned. From 
Lemma 1, we have 

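\begin{align}
&\mathbb{E}_{S_i | \hat{S}_i}\big( \log(1 + h_i^H K h_i) - \log(1 + g_i^H K g_i) \big) \notag\\
&\quad \le \log \frac{1 + \|\hat{h}_i\|^2 \lambda_1}{1 + 2^{\gamma} \sigma^2 \lambda_1} + O(1) \notag\\
&\quad \le \log\Big( 1 + \frac{\|\hat{h}_i\|^2}{2^{\gamma} \sigma^2} \Big) + O(1) \tag{23}\\
&\quad \le -\log(\sigma^2) + \log(2^{\gamma} \sigma^2 + \|\hat{h}_i\|^2) + O(1) \tag{24}
\end{align}

where (23) is from the fact that $\log\frac{1 + ax}{1 + bx} \le \log(1 + \frac{a}{b})$, $\forall\, a, x \ge 0$, $b > 0$. 
Note that the above upper bound does not depend on 
$K$. From (20) and (24) and by noticing that $\sigma^2 \le 1$, we have 

\begin{align}
&\max_{P_{T_i} P_{X_i|T_i}} \big( h(Y_i, Z_i \mid T_i, S_i) - 2h(Z_i \mid T_i, S_i) \big) \notag\\
&\quad \le \alpha \log P + \mathbb{E}_{\hat{S}_i}\big( \log(2^{\gamma} + \|\hat{h}_i\|^2) \big) + O(1) \notag\\
&\quad = \alpha \log P + O(1). \tag{25}
\end{align}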
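The elementary inequality invoked for (23), $\log\frac{1+ax}{1+bx} \le \log(1 + \frac{a}{b})$ for $a, x \ge 0$ and $b > 0$, is equivalent to $0 \le b^2 x + a$ after clearing denominators. The short randomized check below is only a sanity test of that step, with illustrative function names and arbitrary sampling ranges; it is not a proof.

```python
import math
import random


def lhs(a: float, b: float, x: float) -> float:
    """log2((1 + a x) / (1 + b x)), the left side of the inequality used in (23)."""
    return math.log2((1 + a * x) / (1 + b * x))


def rhs(a: float, b: float) -> float:
    """log2(1 + a / b), the x-independent upper bound."""
    return math.log2(1 + a / b)


def inequality_holds(trials: int = 10000, seed: int = 0) -> bool:
    """Check lhs <= rhs on random draws with a, x >= 0 and b > 0."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = rng.uniform(0.0, 1e3)
        b = rng.uniform(1e-3, 1e3)
        x = rng.uniform(0.0, 1e6)
        if lhs(a, b, x) > rhs(a, b) + 1e-9:  # small slack for rounding
            return False
    return True


if __name__ == "__main__":
    print("inequality holds on all samples:", inequality_holds())
```

In (23) the roles are $a = \|\hat{h}_i\|^2$, $b = 2^{\gamma}\sigma^2$ and $x = \lambda_1$, which is why the resulting bound no longer depends on $K$.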

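The chain (15)-(20) bounding the second maximization in (13) reads: 

\begin{align}
&\max_{P_{T_i} P_{X_i|T_i}} \big( h(Y_i, Z_i \mid T_i, S_i) - 2h(Z_i \mid T_i, S_i) \big) \notag\\
&\le \max_{P_{T_i}} \mathbb{E}_{T_i}\Big( \max_{P_{X_i|T_i}} \big( h(Y_i, Z_i \mid T_i = T, S_i) - 2h(Z_i \mid T_i = T, S_i) \big) \Big) \tag{15}\\
&= \max_{P_{T_i}} \mathbb{E}_{T_i}\Big( \max_{P_{X_i|T_i}} \mathbb{E}_{S_i | T_i}\big( h(Y_i, Z_i \mid T_i = T, S_i = S) - 2h(Z_i \mid T_i = T, S_i = S) \big) \Big) \notag\\
&= \max_{P_{T_i}} \mathbb{E}_{T_i}\Big( \max_{P_{X_i|T_i}} \mathbb{E}_{S_i | \hat{S}_i}\big( h(S_i X_i + N_i \mid T_i = T) - 2h(g_i^H X_i + \Omega_i \mid T_i = T) \big) \Big) \tag{16}\\
&= \max_{P_{T_i}} \mathbb{E}_{T_i}\Big( \max_{C :\, C \succeq 0,\, \mathrm{tr}(C) \le P} \ \max_{P_{X_i|T_i} :\, \mathrm{Cov}(X_i | T_i) \preceq C} \mathbb{E}_{S_i | \hat{S}_i}\big( h(S_i X_i + N_i \mid T_i = T) - 2h(g_i^H X_i + \Omega_i \mid T_i = T) \big) \Big) \tag{17}\\
&\le \max_{P_{T_i}} \mathbb{E}_{T_i}\Big( \max_{C :\, C \succeq 0,\, \mathrm{tr}(C) \le P} \mathbb{E}_{S_i | \hat{S}_i}\big( \log\det(I + S_i K_* S_i^H) - 2\log(1 + g_i^H K_* g_i) \big) \Big) \tag{18}\\
&\le \mathbb{E}_{\hat{S}_i}\Big( \max_{K :\, K \succeq 0,\, \mathrm{tr}(K) \le P} \mathbb{E}_{S_i | \hat{S}_i}\big( \log\det(I + S_i K S_i^H) - 2\log(1 + g_i^H K g_i) \big) \Big) \tag{19}\\
&\le \mathbb{E}_{\hat{S}_i}\Big( \max_{K :\, K \succeq 0,\, \mathrm{tr}(K) \le P} \mathbb{E}_{S_i | \hat{S}_i}\big( \log(1 + h_i^H K h_i) - \log(1 + g_i^H K g_i) \big) \Big). \tag{20}
\end{align}

From (12), (14), (25), and by letting $n \to \infty$, we have 

$$R_1 + 2R_2 \le (2 + \alpha)\log P + O(1),$$

from which we obtain (3c) by dividing both sides of the above 
inequality by $\log P$ and letting $P \to \infty$. Similarly, from (11) 
and (14), and by letting $n \to \infty$, we have 

$$R_2 \le \log P + O(1),$$

from which the single user bound (3b) follows immediately. 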
To obtain (3a) and (3d), we can use the genie-aided model 
in which receiver 2 is helped by the genie and has perfect 
knowledge of $y_t$. Due to the symmetry, the same reasoning as 
above can be applied by swapping the roles of receiver 1 and 
receiver 2. The converse part is thus completed. 

Remark 3.1: In a nutshell, the converse proof can be 
summarized as follows, in terms of the essential elements 
mentioned at the beginning of this section. First, the "degraded" 
property enables the use of the extremal inequality (cf. (17) and 
(18)). Then, the latter provides a closed-form upper bound given 
by the Gaussian distribution (cf. (20)). Finally, the isotropic 
property of the channel uncertainty is exploited only at the 
end of the proof, to bound the expectation of the logarithmic 
function (cf. (22)). 

IV. Achievability 

To show the achievability of the whole region, it is enough 
to show that all corner points in Fig. 1 are achievable. Note that 
the extreme points $(1, 0)$ and $(0, 1)$ can be trivially achieved 
by serving only one of the users. The rest of the section 
is devoted to proving the achievability of $(1, \alpha)$, $(\alpha, 1)$, and 
$(\frac{2+\alpha}{3}, \frac{2+\alpha}{3})$. Since the DoF region does not depend on the 
number of transmit antennas $m$, $\forall m \ge 2$, it is enough to 
prove the achievability for the case $m = 2$ which is assumed 
implicitly in this section. The exact achievable rate region 
from which the DoF can be derived in a more rigorous way is 
provided in the appendix. 

A. Achieving $(1, \alpha)$ and $(\alpha, 1)$ 

One of the key elements to achieve the three corner points 
is broadcasting with common message in the presence of 



imperfect current CSIT. The following result is crucial and 
will be repeatedly used in the proofs. 

Lemma 2 (broadcast channel with common message): 
Let $(R_c, R_{p1}, R_{p2})$ be the rate of common message, private 
message for user 1, and private message for user 2, respectively. 
Furthermore, we let $(d_c, d_{p1}, d_{p2})$ be the corresponding DoF. 
Then, there exists a family of codes $\{X_c(P), X_{p1}(P), X_{p2}(P)\}$, 
such that 

$$d_c = 1 - \alpha, \quad \text{and} \quad d_{p1} = d_{p2} = \alpha$$

are achievable simultaneously. 

A sketch of proof is as follows, with more details given in 
Appendix B. Let us consider a single channel use with a 
superposition scheme: $x = x_c + x_{p1} + x_{p2}$ with precoding 
such that $\mathbb{E}(x_{p1} x_{p1}^H) = P_p \Psi_{\hat{g}^{\perp}}$ and $\mathbb{E}(x_{p2} x_{p2}^H) = P_p \Psi_{\hat{h}^{\perp}}$. 
We set the power $P_p \sim P^{\alpha}$ such that the private signals are 
drowned by the AWGN at the unintended receivers while 
remaining at the level $P^{\alpha}$ at the intended receivers. The power of 
the common signal is $P_c = \mathbb{E}(\|x_c\|^2) \sim P$. The decoding is 
performed as follows. At each receiver, the common message 
is decoded first by treating the private signals as noise. The 
signal-to-interference-and-noise ratio (SINR) is approximately 
$P_c / P_p \sim P^{1-\alpha}$, from which the achievability of $d_c = 1 - \alpha$ is 
shown. Then, each receiver decodes its own private message, 
after removing the decoded common message. The SINR for 
the private message being approximately $P^{\alpha}$, $d_{pk} = \alpha$ is thus 
achievable for user $k$, $k = 1, 2$. 
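The power-scaling argument in this sketch can be illustrated numerically. The following toy calculation is an assumption-laden simplification, not the proof in the appendix: it sets all channel gains to one, takes $\sigma^2 = P^{-\alpha}$, uses a hypothetical constant 1/4 in $P_p = P^{\alpha}/4$ so that the common power stays positive for every $\alpha \in [0, 1]$, and reads off the pre-log factors from $\log(1 + \mathrm{SINR}) / \log P$ at a large but finite $P$.

```python
import math


def lemma2_dof_estimates(snr: float, alpha: float):
    """Estimate (d_c, d_p) from SINR exponents in a unit-gain toy model."""
    if not (0.0 <= alpha <= 1.0):
        raise ValueError("alpha must lie in [0, 1]")
    sigma2 = snr ** (-alpha)            # CSIT error variance, sigma^2 = P^{-alpha}
    p_private = snr ** alpha / 4.0      # power per private signal, P_p ~ P^alpha
    p_common = snr - 2.0 * p_private    # remaining power for the common signal, P_c ~ P
    leakage = sigma2 * p_private        # residual power of the other user's ZF-precoded signal
    noise = 1.0
    # Common message: decoded first, private signals treated as noise
    sinr_common = p_common / (p_private + leakage + noise)
    # Private message: decoded after removing the common message
    sinr_private = p_private / (leakage + noise)
    log_p = math.log2(snr)
    return (math.log2(1.0 + sinr_common) / log_p,
            math.log2(1.0 + sinr_private) / log_p)


if __name__ == "__main__":
    for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
        d_c, d_p = lemma2_dof_estimates(1e60, alpha)
        print(f"alpha={alpha:.2f}  d_c~{d_c:.3f}  d_p~{d_p:.3f}")
```

The printed values approach $1 - \alpha$ and $\alpha$ respectively, with a gap of order $1/\log P$ caused by the constants.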

From the above lemma, the achievability of $(1, \alpha)$ is 
straightforward. Let $W_1$ and $W_2$ be the messages for user 1 and 
user 2, respectively. Assuming that the DoF are respectively 
$d_1$ and $d_2$, we can split user 1's message as $W_1 = (W_{10}, W_{11})$ 
with the corresponding rate-splitting $d_1 = d_{10} + d_{11}$. Then, 
$(W_{10}, W_{11}, W_2)$ are broadcast to both users with $W_{10}$ as 
common message. According to Lemma 2, $(W_{10}, W_{11})$ and 
$(W_{10}, W_2)$ can be recovered by user 1 and user 2, respectively, 
as long as 

$$d_{10} \le 1 - \alpha, \quad d_{11} \le \alpha, \quad \text{and} \quad d_2 \le \alpha,$$

which implies that $d_1 = d_{10} + d_{11} \le 1$ and $d_2 \le \alpha$ are achievable 
simultaneously. Similarly, $(\alpha, 1)$ can also be achieved by the 
same scheme with rate-splitting over user 2's message. 

The proposed scheme, hereafter referred to as rate- 
splitting (RS), achieves both corner points (l,a) and (a, 1) 
with only current CSIT and without delayed CSIT at all. A 
sum DoF of 1 + a is thus attained. The idea is closely related to 
the Han-Kobayashi scheme [11] for the two-user interference 
channel where each receiver can decode and then eliminate the 
common part of the interfering signal to achieve a higher rate. 
Therefore, the common message in our RS scheme is desirable 
for only one of the users but is decodable by both users. 

B. Achieving the symmetric corner point $(\frac{2+\alpha}{3}, \frac{2+\alpha}{3})$ 

In the following, we show that exploiting both current and 
delayed CSIT, the symmetric corner point $(\frac{2+\alpha}{3}, \frac{2+\alpha}{3})$ can 
be achieved. It provides a sum DoF of $\frac{4+2\alpha}{3}$ that is strictly 
larger than $1 + \alpha$ for $\alpha < 1$. Since this scheme builds on the 
MAT scheme, we briefly review it first. 
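The comparison between the two sum DoF values follows from a one-line computation, written out here for convenience:

```latex
\begin{align*}
\frac{4 + 2\alpha}{3} - (1 + \alpha)
  &= \frac{4 + 2\alpha - 3 - 3\alpha}{3}
   = \frac{1 - \alpha}{3} > 0 \quad \text{for } \alpha < 1, \\
\frac{4 + 2\alpha}{3} - \frac{4}{3}
  &= \frac{2\alpha}{3} \ge 0 \quad \text{for } \alpha \ge 0 .
\end{align*}
```

The first line is the gain over rate-splitting, which uses only current CSIT; the second is the gain over the MAT sum DoF of 4/3, which uses only delayed CSIT. The two gains vanish at $\alpha = 1$ and $\alpha = 0$, respectively.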

1) MAT alignment revisited: In the two-user MISO case, the 
original MAT is a three-slot scheme, described by the equations 

$$\begin{array}{lll}
x_1 = u, & x_2 = v, & x_3 = [\, g_1^H u + h_2^H v \ \ 0 \,]^T, \\
y_1 = h_1^H u, & y_2 = h_2^H v, & y_3 = h_{31}^{*} (g_1^H u + h_2^H v), \\
z_1 = g_1^H u, & z_2 = g_2^H v, & z_3 = g_{31}^{*} (g_1^H u + h_2^H v),
\end{array}$$

where $x_t \in \mathbb{C}^{m \times 1}$, $y_t, z_t \in \mathbb{C}$ are the transmitted signal, 
received signals at user 1 and user 2, respectively, at time 
slot $t$; $u, v \in \mathbb{C}^{m \times 1}$ are useful signals to user 1 and 
user 2, respectively; for simplicity, we omit the noise in the 
received signals. The idea of the MAT scheme is to use 
delayed CSIT to align the mutual interference into a 
one-dimensional subspace ($h_2^H v$ for user 1 and $g_1^H u$ for user 2). 
Importantly, the interference is reduced without sacrificing the 
dimension of the useful signals. Specifically, a two-dimensional 
interference-free observation of $u$ (resp. $v$) is obtained at 
receiver 1 (resp. receiver 2). 
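A noiseless toy simulation can make the three-slot alignment tangible. The sketch below assumes $m = 2$, draws random complex channels, and follows the equations above from the viewpoint of receiver 1; the helper names (`inner`, `solve_2x2`, `mat_receiver1`) are illustrative and not taken from the paper. Receiver 1 uses its own slot-2 observation to cancel the interference term in slot 3 and then inverts a $2 \times 2$ system, relying on the assumption that receivers know all channel states.

```python
import random


def inner(a, b):
    """Return a^H b for two complex vectors given as lists."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def rand_vec(rng, m=2):
    """Random complex vector with independent Gaussian real and imaginary parts."""
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(m)]


def solve_2x2(rows, rhs):
    """Solve [rows[0]^H; rows[1]^H] u = rhs by Cramer's rule."""
    a, b = rows[0][0].conjugate(), rows[0][1].conjugate()
    c, d = rows[1][0].conjugate(), rows[1][1].conjugate()
    det = a * d - b * c
    return [(rhs[0] * d - b * rhs[1]) / det, (a * rhs[1] - c * rhs[0]) / det]


def mat_receiver1(seed=0):
    """Run the original three-slot MAT scheme without noise and decode u at receiver 1."""
    rng = random.Random(seed)
    h = [rand_vec(rng) for _ in range(3)]  # h[t-1]: channel of user 1 in slot t
    g = [rand_vec(rng) for _ in range(3)]  # g[t-1]: channel of user 2 in slot t
    u, v = rand_vec(rng), rand_vec(rng)
    # Slot 1: x1 = u. Slot 2: x2 = v.
    y1 = inner(h[0], u)
    y2 = inner(h[1], v)
    # Slot 3: x3 = [g1^H u + h2^H v, 0]^T, built from delayed CSIT only
    x3 = [inner(g[0], u) + inner(h[1], v), 0j]
    y3 = inner(h[2], x3)
    # Receiver 1 removes the scalar h_{31}^* and cancels its own interference y2
    g1u = y3 / h[2][0].conjugate() - y2
    # Two interference-free equations in u: h1^H u = y1 and g1^H u = g1u
    u_hat = solve_2x2([h[0], g[0]], [y1, g1u])
    return u, u_hat


if __name__ == "__main__":
    u, u_hat = mat_receiver1(seed=1)
    print("max recovery error:", max(abs(a - b) for a, b in zip(u, u_hat)))
```

Two symbols are recovered over three slots, which is the 2/3 DoF per user mentioned in the introduction. Decoding at receiver 2 is symmetric.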

Interestingly, the alignment can be done in a different manner. 



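$$\begin{array}{lll}
x_1 = u + v, & x_2 = [\, h_1^H v \ \ 0 \,]^T, & x_3 = [\, g_1^H u \ \ 0 \,]^T, \\
y_1 = h_1^H (u + v), & y_2 = h_{21}^{*} h_1^H v, & y_3 = h_{31}^{*} g_1^H u, \\
z_1 = g_1^H (u + v), & z_2 = g_{21}^{*} h_1^H v, & z_3 = g_{31}^{*} g_1^H u.
\end{array}$$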



In the first slot, the transmitter sends the private signals to 
both users by simply superposing them. In the second slot, 
the transmitter sends the interference overheard by receiver 1 
in the first slot. The role of this stage is two-fold: resolving 
interference for user 1 and reinforcing signal for user 2. In the 
third slot, the transmitter sends the interference overheard by 
user 2 to help both users the other way around. In summary, this 
variant of the MAT scheme consists of two phases: i) broadcast 
of the private signals, and ii) multicast of the overheard 
interferences. At the end of three time slots, the observations 
at the receivers are the three pairs $(y_t, z_t)$, $t = 1, 2, 3$, listed above. 
For each user, the useful signal lies in a two-dimensional 
subspace while the interference is aligned in a one-dimensional 
subspace. It readily follows that this variant enables each user 
to achieve two degrees of freedom in the three-dimensional 
time space as for the original MAT scheme. Although the 
original and variant schemes are equivalent from the point of view of 
the space-time alignment, they differ conceptually in the way 
the "order-two" symbols are delivered. More precisely, 
the variant spends two slots to deliver two separate symbols: 
the interferences overheard by user 1 and user 2, denoted by 

$$\eta_1 \triangleq h_1^H v \quad \text{and} \quad \eta_2 \triangleq g_1^H u,$$

while the original MAT spends a single slot to deliver one 
symbol $h_2^H v + g_1^H u$. 

2) Proposed scheme: Based on the above variant of the MAT 
alignment, we propose a new scheme that exploits optimally 
both the perfect delayed and imperfect current CSIT. Before 
proceeding further, we would like to highlight the main ideas 
as compared to the MAT alignment (Fig. 2): 

• Spatial precoding and power allocation in the first slot: 
$1 + (1 - \alpha) = 2 - \alpha$ instead of two streams are broadcast. 

• Digitizing the overheard interferences $(\eta_1, \eta_2)$ in 
approximately $2(1 - \alpha) \log P$ bits. 

• Broadcasting the digitized interferences $(\hat{\eta}_1, \hat{\eta}_2)$ as 
common message and two new private messages of $\alpha \log P$ 
bits each, in the second and third slots. 

These ideas will be explored in the rest of the section, after which 
the interpretation of Fig. 2 will become clear. Since only 
$h_1$ and $g_1$ are involved below, we drop the time indices for 
convenience. 
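The three ideas above already determine the symmetric DoF through simple bookkeeping, in units of $\log P$ bits. The sketch below is only that bookkeeping, with illustrative names; it assumes, as the decoding discussion later in the section establishes, that the $2 - \alpha$ first-slot streams become decodable once both digitized interferences are delivered.

```python
from fractions import Fraction


def proposed_scheme_accounting(alpha):
    """Per-user DoF bookkeeping over the three slots, in units of log P."""
    a = Fraction(alpha)
    slot1_streams = 1 + (1 - a)           # 2 - alpha streams per user in slot 1
    quantized_bits = 2 * (1 - a)          # both overheard interferences, digitized
    common_per_slot = quantized_bits / 2  # multicast split over slots 2 and 3
    private_per_slot = a                  # new private message per user per slot
    per_user_total = slot1_streams + 2 * private_per_slot
    return {
        "common_per_slot": common_per_slot,
        "private_per_slot": private_per_slot,
        "per_user_dof": per_user_total / 3,
    }


if __name__ == "__main__":
    for alpha in (Fraction(0), Fraction(1, 2), Fraction(1)):
        acct = proposed_scheme_accounting(alpha)
        print(alpha, {k: str(v) for k, v in acct.items()})
```

The per-slot load in slots 2 and 3 is $(1 - \alpha)$ common plus $\alpha$ private, which is exactly the operating point of Lemma 2, and the total $(2 - \alpha) + 2\alpha = 2 + \alpha$ over three slots gives $\frac{2+\alpha}{3}$ per user.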

Spatial precoding and power allocation: As in the MAT 
alignment, we first superpose the two private signals as $x = u + v$, 
except that $u$ and $v$ are precoded beforehand. The 
precoding is specified by the covariance matrices 

$$Q_u \triangleq \mathbb{E}(u u^H) \quad \text{and} \quad Q_v \triangleq \mathbb{E}(v v^H)$$

that may depend on the estimates of the current channel. The 
power constraint is respected by choosing $Q_u$ and $Q_v$ such 
that $\mathrm{tr}(Q_u) + \mathrm{tr}(Q_v) \le P$. In particular, we choose $Q_u$ and 
$Q_v$ in such a way that the power of the interferences $\eta_1$ and 
$\eta_2$ is reduced and scales as $O(P^{1-\alpha})$. To this end, 

• for user $k$, $k = 1, 2$, we send two streams of messages 
$(W_{k,1}, W_{k,2})$ in two orthogonal directions: one perpendicular 
to the estimated channel of the unintended user, 
while the other one aligned with it, i.e., 

$$Q_u = P_1 \Psi_{\hat{g}^{\perp}} + P_2 \Psi_{\hat{g}}, \qquad Q_v = P_1 \Psi_{\hat{h}^{\perp}} + P_2 \Psi_{\hat{h}},$$

• the transmit power in the estimated channel direction is 
such that $P_2 \sim P^{1-\alpha}$, whereas the transmit power in the 
orthogonal direction is $P_1 = P - P_2 \sim P$ for any $\alpha \le 1$. 

Fig. 2. Overview of the main differences between the Maddah-Ali Tse alignment and the proposed scheme. 

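With $Q_u$ and $Q_v$ chosen as such, it is readily shown that, for 
a given channel realization $h$, the power of the interference 
seen by user 1 is 

\begin{align}
\sigma_{\eta_1}^2 \triangleq \mathbb{E}_v(|h^H v|^2) &= h^H Q_v h \notag\\
&= h^H (P_1 \Psi_{\hat{h}^{\perp}} + P_2 \Psi_{\hat{h}}) h \notag\\
&= P_1 \tilde{h}^H \Psi_{\hat{h}^{\perp}} \tilde{h} + P_2 h^H \Psi_{\hat{h}} h \notag\\
&\le P_1 \|\tilde{h}\|^2 + P_2 \|h\|^2. \notag
\end{align}

By averaging $\sigma_{\eta_1}^2$ over $h$, we have 

$$\mathbb{E}(\sigma_{\eta_1}^2) = O(P^{1-\alpha}). \tag{26}$$

Due to the symmetry, defining $\sigma_{\eta_2}^2 \triangleq \mathbb{E}_u(|g^H u|^2)$, we also 
have $\mathbb{E}(\sigma_{\eta_2}^2) = O(P^{1-\alpha})$. 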
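A small Monte Carlo experiment can illustrate the scaling in (26). The sketch below assumes $m = 2$, i.i.d. complex Gaussian estimates and errors with variances $1 - \sigma^2$ and $\sigma^2 = P^{-\alpha}$, and the power split $P_2 = P^{1-\alpha}$, $P_1 = P - P_2$ with unit constants; these constants and all function names are illustrative choices. It evaluates $h^H Q_v h$ for $Q_v = P_1 \Psi_{\hat{h}^{\perp}} + P_2 \Psi_{\hat{h}}$, checks the per-realization bound $P_1 \|\tilde{h}\|^2 + P_2 \|h\|^2$, and estimates the growth exponent of the average.

```python
import math
import random


def cgauss(rng, var):
    """Circularly symmetric complex Gaussian sample with variance var."""
    s = math.sqrt(var / 2.0)
    return complex(rng.gauss(0, s), rng.gauss(0, s))


def interference_power(h_hat, h_err, p1, p2):
    """h^H Q_v h with Q_v = p1 * Psi_{h_hat_perp} + p2 * Psi_{h_hat}, for m = 2."""
    h = [a + b for a, b in zip(h_hat, h_err)]
    norm2 = sum(abs(x) ** 2 for x in h_hat)
    # Component of h along h_hat: |h_hat^H h|^2 / ||h_hat||^2
    along = abs(sum(x.conjugate() * y for x, y in zip(h_hat, h))) ** 2 / norm2
    # Component along h_hat_perp = [-conj(h_hat[1]), conj(h_hat[0])]
    across = abs(h_hat[0] * h[1] - h_hat[1] * h[0]) ** 2 / norm2
    return p1 * across + p2 * along


def average_interference(snr, alpha, trials=2000, seed=0):
    """Return (empirical mean of h^H Q_v h, whether the bound held on every draw)."""
    if not (0.0 < alpha <= 1.0):
        raise ValueError("this sketch needs 0 < alpha <= 1")
    rng = random.Random(seed)
    sigma2 = snr ** (-alpha)
    p2 = snr ** (1.0 - alpha)
    p1 = snr - p2
    total, bound_ok = 0.0, True
    for _ in range(trials):
        h_hat = [cgauss(rng, 1.0 - sigma2) for _ in range(2)]
        h_err = [cgauss(rng, sigma2) for _ in range(2)]
        value = interference_power(h_hat, h_err, p1, p2)
        err2 = sum(abs(x) ** 2 for x in h_err)
        h2 = sum(abs(a + b) ** 2 for a, b in zip(h_hat, h_err))
        bound_ok = bound_ok and value <= (p1 * err2 + p2 * h2) * (1 + 1e-9)
        total += value
    return total / trials, bound_ok


def interference_exponent(snr, alpha):
    """log(mean interference) / log(P), which should be close to 1 - alpha."""
    mean, _ = average_interference(snr, alpha)
    return math.log(mean) / math.log(snr)


if __name__ == "__main__":
    for alpha in (0.25, 0.5, 0.75):
        print(alpha, round(interference_exponent(1e6, alpha), 3))
```

At finite $P$ the estimated exponent sits slightly above $1 - \alpha$ because of the multiplicative constants hidden in the $O(\cdot)$ notation.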

Digitizing the overheard interferences: As in the second 
phase of the MAT variant, we would like to convey the 
overheard interferences $(h^H v, g^H u)$ to both receivers. However, 
unlike the original MAT scheme where these symbols are 
transmitted in an analog fashion, we quantize them and then 
transmit the digital version. The rationale behind this choice 
is as follows. With the precoding and power allocation as 
described above, the overheard interferences have a reduced 
power $O(P^{1-\alpha})$, without sacrificing too much received signal 
power¹. As a result, we should be able to compress the 
interferences, which in turn makes room for transmission of 
new symbols. The benefit can be significant when the current 
CSIT is nearly perfect. In this case, the analog transmission is 
no longer suitable, due to the mismatch between the source 
(interference) power and available transmit power. Therefore, a 
good alternative is to quantize the interferences and to transmit 
the encoded symbols. The number of quantization bits depends 
naturally on the interference power that is related to the quality 
of the current channel state information. 

¹ With no CSIT on the current channel, the only way to reduce the 
interference power is to reduce the transmit power, and therefore the received 
signal power. 

For simplicity, we suppose that $\eta_1$ and $\eta_2$ are quantized 
separately. Furthermore, let us assume that an $R_{\eta_k}$-bit quantizer 
is used for $\eta_k$, $k = 1, 2$. Hence, we have 

$$\eta_k = \hat{\eta}_k + \Delta_k$$

where $\hat{\eta}_k$ and $\Delta_k$ are respectively the quantized value and the 
quantization noise with average distortion $\mathbb{E}(|\Delta_k|^2) = D_k$, 
$k = 1, 2$. The index corresponding to $\hat{\eta} = (\hat{\eta}_1, \hat{\eta}_2)$, represented 
in $R_{\eta} \triangleq R_{\eta_1} + R_{\eta_2}$ bits, is then multicast to both users. In 
order not to incur a DoF loss with the quantization, we set 
the distortion to the noise level, i.e., $D_1 = D_2 = 1$. With the 
above choices, we can upper-bound the quantization rate $R_{\eta}$ 

\begin{align}
R_{\eta} &\le \mathbb{E}\big( \log(\sigma_{\eta_1}^2) \big) + \mathbb{E}\big( \log(\sigma_{\eta_2}^2) \big) \notag\\
&\le \log\big( \mathbb{E}(\sigma_{\eta_1}^2) \big) + \log\big( \mathbb{E}(\sigma_{\eta_2}^2) \big) \notag\\
&\le 2(1 - \alpha) \log P + O(1) \notag
\end{align}

where the first inequality is from the rate-distortion theorem and 
the fact that Gaussian source is the hardest to compress [10]; 
the second inequality is from the concavity of the log function 
and Jensen's inequality; the last one is from (26). 
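The bit budget can be illustrated with the Gaussian rate-distortion function. For a circularly symmetric complex Gaussian source of variance $\sigma^2$ under squared-error distortion $D$, $R(D) = \max\{0, \log_2(\sigma^2 / D)\}$ bits per sample. The sketch below plugs in the interference power scaling $P^{1-\alpha}$ from (26) with the multiplicative constant set to one, which is an illustrative simplification; the function names are likewise illustrative.

```python
import math


def gaussian_rate_distortion_bits(variance: float, distortion: float) -> float:
    """R(D) = max(0, log2(variance / D)) for a complex Gaussian source."""
    return max(0.0, math.log2(variance / distortion))


def quantization_budget(snr: float, alpha: float, distortion: float = 1.0) -> float:
    """Total bits for quantizing both overheard interferences at the given distortion."""
    variance = snr ** (1.0 - alpha)  # interference power scaling from (26), constant ignored
    r1 = gaussian_rate_distortion_bits(variance, distortion)
    r2 = gaussian_rate_distortion_bits(variance, distortion)
    return r1 + r2


if __name__ == "__main__":
    P = 2.0 ** 20
    for alpha in (0.0, 0.5, 1.0):
        bits = quantization_budget(P, alpha)
        print(f"alpha={alpha:.1f}  bits={bits:.1f}  2(1-alpha)log2(P)={2 * (1 - alpha) * 20:.1f}")
```

With distortion at the noise level, the budget equals $2(1 - \alpha)\log_2 P$ in this idealized setting and drops to zero when $\alpha = 1$, in which case nothing needs to be multicast.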

Multicasting digitized interferences and broadcasting new
private messages: The next step is to communicate the
digitized interferences $(\hat{\eta}_1, \hat{\eta}_2)$, represented approximately in
$2(1-\alpha)\log P$ bits, to both users. This information is broadcast
as a common message in two slots. Meanwhile, new private
messages $(W_{1,3}, W_{2,3})$ and $(W_{1,4}, W_{2,4})$ are sent to both users
simultaneously in the second and third slots, respectively. The
superposition is illustrated in Fig. 2. In the following, we let
$(d_c, d_{p1}, d_{p2})$ denote the corresponding DoF per slot for the
common message, private messages for user 1 and user 2,
respectively. Since the $2(1-\alpha)\log P$ bits are spread over two
slots, it is readily shown that $d_c = 1 - \alpha$.



Decoding: Each user first decodes the second and third
slots, i.e., receiver $k$ recovers $(\hat{\eta}_1, \hat{\eta}_2, W_{k,3}, W_{k,4})$, $k = 1, 2$.
According to Lemma 2 and given that $d_c = 1 - \alpha$, these
messages can be decoded reliably as long as

\[ d_{pk} \le \alpha, \quad k = 1, 2. \]

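Then, receiver 1 has the following equations

\[
\begin{aligned}
y &= h^H u + \eta_1 + \varepsilon, \\
\hat{\eta}_1 &= \eta_1 - \Delta_1, \\
\hat{\eta}_2 &= \eta_2 - \Delta_2 = g^H u - \Delta_2,
\end{aligned}
\]

from which an equivalent $2 \times 2$ MIMO channel is obtained

\[
\begin{bmatrix} y - \hat{\eta}_1 \\ \hat{\eta}_2 \end{bmatrix}
= S u + \begin{bmatrix} \varepsilon + \Delta_1 \\ -\Delta_2 \end{bmatrix}
\tag{27}
\]

where the noise $b = [\varepsilon + \Delta_1 \;\; -\Delta_2]^T$ depends on the input
signals in general. Similarly, receiver 2 has

\[
\begin{bmatrix} \hat{\eta}_1 \\ z - \hat{\eta}_2 \end{bmatrix}
= S v + \begin{bmatrix} -\Delta_1 \\ \omega + \Delta_2 \end{bmatrix}.
\]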
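
The algebra behind the equivalent channel at receiver 1 can be checked numerically. The sketch below is only an illustration under assumptions: it draws arbitrary complex vectors for $h$, $g$, $u$, $v$, uses small random numbers as stand-ins for the quantization noises $\Delta_1, \Delta_2$, and verifies that subtracting $\hat{\eta}_1$ from $y$ and stacking $\hat{\eta}_2$ reproduces $Su$ plus the noise vector $[\varepsilon + \Delta_1, -\Delta_2]^T$. All helper names are hypothetical.

```python
import random


def inner(a, b):
    """Return a^H b for two equal-length complex vectors."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def complex_gauss(rng, scale=1.0):
    """Draw one complex number with independent Gaussian real and imaginary parts."""
    return complex(rng.gauss(0.0, scale), rng.gauss(0.0, scale))


def equivalent_channel_residual(seed=0, m=2):
    """Largest absolute mismatch between both sides of the equivalent channel."""
    rng = random.Random(seed)
    h = [complex_gauss(rng) for _ in range(m)]
    g = [complex_gauss(rng) for _ in range(m)]
    u = [complex_gauss(rng) for _ in range(m)]
    v = [complex_gauss(rng) for _ in range(m)]
    eps = complex_gauss(rng)
    # Stand-ins for the quantization noises (assumed values, not a real quantizer).
    delta1, delta2 = complex_gauss(rng, 0.1), complex_gauss(rng, 0.1)

    eta1, eta2 = inner(h, v), inner(g, u)      # overheard interferences
    y = inner(h, u) + eta1 + eps               # received signal at receiver 1
    eta1_hat, eta2_hat = eta1 - delta1, eta2 - delta2

    left = (y - eta1_hat, eta2_hat)
    right = (inner(h, u) + eps + delta1, inner(g, u) - delta2)
    return max(abs(a - b) for a, b in zip(left, right))


if __name__ == "__main__":
    print(equivalent_channel_residual())
```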



In order to recover the messages $W_{\mathrm{mimo},1} = (W_{1,1}, W_{1,2})$
encoded in $u$ or $W_{\mathrm{mimo},2} = (W_{2,1}, W_{2,2})$ encoded in $v$, each
user performs conventional MIMO decoding of the above
equivalent channel. Let $R_{\mathrm{mimo}}$ denote the achievable rate of
the equivalent channel (27) in bits per channel use and $d_{\mathrm{mimo}}$
the corresponding DoF. We can lower-bound $R_{\mathrm{mimo}}$ as follows:

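\[
\begin{aligned}
R_{\mathrm{mimo}} &= \mathbb{E}\big(I(U; Y \mid \boldsymbol{S} = S)\big) \\
&= \mathbb{E}\big(I(SU; Y)\big) && (28) \\
&= \mathbb{E}\big(h(SU) - h(SU \mid Y)\big) \\
&= \mathbb{E}\big(h(SU) - h(E + \Delta_1, -\Delta_2 \mid Y)\big) \\
&\ge \mathbb{E}\big(h(SU) - h(E + \Delta_1, -\Delta_2)\big) && (29) \\
&\ge \mathbb{E}\big(\log\det(S Q_u S^H)\big) - \log(1 + D_1) - \log(D_2) && (30) \\
&= \mathbb{E}\big(\log\det(Q_u)\big) + \mathbb{E}\big(\log\det(S S^H)\big) - \log(1 + D_1) - \log(D_2) \\
&= \log(P_1 P_2) + O(1) \\
&= (2 - \alpha)\log(P) + O(1)
\end{aligned}
\]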

where (28) is from the fact that $S$ is invertible almost surely and
therefore the linear transformation is information-lossless; (29)
holds since conditioning does not increase differential entropy;
(30) follows because $u$ is Gaussian, and by noticing that $E + \Delta_1$
and $\Delta_2$ are independent with the corresponding differential
entropies maximized by the Gaussian distribution. Finally, in three
slots, user $k$, $k = 1, 2$, can recover the messages $(W_{k,1}, W_{k,2})$
sent in the equivalent MIMO channel corresponding to the
MAT alignment as well as two fresh messages $(W_{k,3}, W_{k,4})$,
from which the average DoF per user per channel use is

\[ d_k = \frac{d_{\mathrm{mimo}} + 2 d_{pk}}{3} = \frac{2 - \alpha + 2\alpha}{3} = \frac{2 + \alpha}{3}. \]

This concludes the achievability of the whole region given 
by ^ and Fig. [T] 

Remark 4.1: By removing the private messages, one can
send the common message at a higher rate (corresponding
to $d_c = 1$ instead of $d_c = 1 - \alpha$) and thus shorten the
communication ($1 + 2(1 - \alpha)$ slots instead of 3 slots). This is
the original idea reported in [8] that provides an achievable
DoF of $\frac{2-\alpha}{3-2\alpha}$. Inspired by the gap between this DoF and the
upper bound given by the converse

\[ \frac{2-\alpha}{3-2\alpha} \quad \text{versus} \quad \frac{2+\alpha}{3} = \frac{2 - \alpha + 2\alpha}{3 - 2\alpha + 2\alpha}, \]

a natural question arose: Can we convey $2\alpha$ more symbols per
user by extending the transmission by $2\alpha$ channel uses, i.e., in
total over three channel uses? It turned out that it is possible
by exploiting the current CSI, according to Lemma 2.

Fig. 3. Comparison of the achievable DoF between the proposed scheme
and the zero-forcing and MAT alignment as a function of $\alpha$.
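
The slot accounting in the remark can be reproduced with exact fractions. This is a minimal sketch with hypothetical function names: the first function counts one slot for the MAT-like transmission plus $2(1-\alpha)$ slots of pure multicast, the second counts three slots carrying $2\alpha$ extra private symbols per user.

```python
from fractions import Fraction


def dof_without_private_messages(alpha):
    """Per-user DoF when the multicast phase carries only the common message."""
    alpha = Fraction(alpha)
    slots = 1 + 2 * (1 - alpha)          # 1 + 2(1 - alpha) channel uses
    return (2 - alpha) / slots


def dof_with_private_messages(alpha):
    """Per-user DoF when 2*alpha private symbols are added over three slots."""
    alpha = Fraction(alpha)
    return (2 - alpha + 2 * alpha) / 3


if __name__ == "__main__":
    for a in (Fraction(0), Fraction(1, 2), Fraction(1)):
        print(a, dof_without_private_messages(a), dof_with_private_messages(a))
```

Under these formulas the two schemes coincide at $\alpha = 0$ and $\alpha = 1$, and the gap in between equals $\frac{2\alpha(1-\alpha)}{3(3-2\alpha)}$.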

In Fig. 3, we compare the achievable DoF of different
schemes. The TDMA (time sharing between single-user
communications) requires neither the current nor the delayed CSIT
and achieves a DoF of $\frac{1}{2}$. The ZF precoding only exploits
the current CSIT with a DoF of $\alpha$, while the MAT scheme
only exploits the delayed CSIT with a DoF of $\frac{2}{3}$. The scheme
"RS+ZF" (Rate-Splitting and ZF precoding) is from equally
time sharing between the corner points $(1, \alpha)$ and $(\alpha, 1)$. It
only exploits the current CSIT with a DoF of $\frac{1+\alpha}{2}$. Note that
when $\alpha$ is close to 0, the estimation of current CSIT is bad and
therefore useless. In this case, the optimal scheme is the MAT
alignment. On the other hand, when $\alpha \ge 1$, the estimation is
good and the interference at the receivers due to the imperfect
estimation is below the noise level and thus can be neglected
as far as the DoF is concerned. In this case, delayed CSIT
is useless and even ZF with the estimated current CSIT is
asymptotically optimal, achieving a DoF of 1 per user. Our
result reveals that strictly larger DoF than $\max\{\frac{2}{3}, \alpha\}$ can be
obtained by exploiting both the imperfect current CSIT and
the perfect delayed CSIT in an intermediate regime $\alpha \in (0, 1)$.
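
The per-user DoF values quoted in this comparison can be tabulated directly. The sketch below is an illustration only; the dictionary keys are hypothetical labels, and the expressions are the ones stated in the text for $0 \le \alpha \le 1$ (TDMA $\frac{1}{2}$, ZF $\alpha$, MAT $\frac{2}{3}$, rate splitting with ZF $\frac{1+\alpha}{2}$, proposed scheme $\frac{2+\alpha}{3}$).

```python
from fractions import Fraction


def per_user_dof(alpha):
    """Per-user DoF of each scheme for a current-CSIT quality exponent alpha."""
    alpha = Fraction(alpha)
    return {
        "tdma": Fraction(1, 2),
        "zf": alpha,
        "mat": Fraction(2, 3),
        "rate_splitting_zf": (1 + alpha) / 2,
        "proposed": (2 + alpha) / 3,
    }


if __name__ == "__main__":
    for a in (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(3, 4), Fraction(1)):
        row = per_user_dof(a)
        print(a, {name: str(value) for name, value in row.items()})
```

For any $\alpha$ strictly between 0 and 1 the last entry exceeds all the others, which is the intermediate regime highlighted above.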

In the appendix, we provide the exact achievable rate region.
Some examples of the achievable sum rates with Rayleigh
fading are shown in Fig. 4 and Fig. 5.² In Fig. 4, we plot
the sum rate performance of our sum-DoF optimal scheme
for different values of $\alpha$. We observe that as the quality of
channel knowledge increases ($\alpha \to 1$), the sum rate improves
significantly with the sharper slope promised by the DoF result.
Note that the performance with $\alpha = 0$ nearly corresponds
to the sum rate achieved by MAT (cf. Fig. 5). In Fig. 5, we
compare our sum-DoF optimal scheme with different strategies:
MAT, ZF, TDMA, as well as "RS+ZF" in terms of the ergodic
sum rate for $\alpha = 0.5$. For this quality of the current CSIT,
ZF performs substantially worse than the others, achieving the
pre-log of one. With the same value of DoF as ZF, the TDMA
scheme performs much better than the ZF scheme, since full
transmit power can be used without causing interference. Note
that the current CSIT is exploited in the TDMA scheme in
such a way that the signal is beamformed in the direction of
the estimated channel. The sum rate with MAT, RS+ZF, and
the proposed scheme increases with a slope of $\frac{4}{3}$, $\frac{3}{2}$, and $\frac{5}{3}$,
respectively, as expected from the DoF results.

²Note that the parameters are fixed according to the choices given in the
appendix without optimization.

Fig. 4. The achievable ergodic sum-rate of the proposed scheme with Rayleigh
fading, for $\alpha = 0, 0.2, \ldots, 1$.

Fig. 5. The achievable ergodic sum-rate of the proposed sum-DoF optimal
scheme, rate-splitting scheme, TDMA, zero-forcing, and MAT alignment. We
set $\alpha = 0.5$.



V. Discussions 

A. DoF with common message 

The main result of this paper can be extended trivially to 
the case with common message. 

Corollary 1: Let $(d_0, d_1, d_2)$ be the degrees of freedom
related to the common message, private message for user 1,
and private message for user 2, respectively. Then, the optimal
DoF region is characterized by

\[
\begin{aligned}
d_0 + d_1 &\le 1, && (31a) \\
d_0 + d_2 &\le 1, && (31b) \\
2 d_0 + d_1 + 2 d_2 &\le 2 + \alpha, && (31c) \\
2 d_0 + 2 d_1 + d_2 &\le 2 + \alpha. && (31d)
\end{aligned}
\]

Proof: The converse follows the same lines as in the case
without common message, presented in Section III. To obtain
(31b) and (31c), we replace $W_2$ by $\tilde{W}_2 = (W_0, W_2)$ and $R_2$
by $\tilde{R}_2 = R_0 + R_2$ throughout Section III and carry out exactly
the same steps. Then, (31a) and (31d) follow straightforwardly
by interchanging the roles of user 1 and user 2, owing to the
symmetry between the two users.

Note that the region is a polyhedron and completely
characterized by the vertices in terms of $(d_0, d_1, d_2)$:

• extreme points: $(1, 0, 0)$, $(0, 1, 0)$, $(0, 0, 1)$,

• private points: $(0, 1, \alpha)$, $(0, \alpha, 1)$, $(0, \frac{2+\alpha}{3}, \frac{2+\alpha}{3})$, and

• mixed point: $(1 - \alpha, \alpha, \alpha)$

which are all achievable with the proposed scheme. Thus, the
entire region is achievable by time sharing between the vertices.
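
A quick way to sanity-check the listed vertices is to test them against the four inequalities of Corollary 1. The sketch below does this with exact fractions; the function names are hypothetical and non-negativity of the three DoF values is added as an assumption, since it is implicit in the text.

```python
from fractions import Fraction


def in_region(d0, d1, d2, alpha):
    """True if (d0, d1, d2) satisfies (31a)-(31d) and is non-negative."""
    return (
        min(d0, d1, d2) >= 0
        and d0 + d1 <= 1
        and d0 + d2 <= 1
        and 2 * d0 + d1 + 2 * d2 <= 2 + alpha
        and 2 * d0 + 2 * d1 + d2 <= 2 + alpha
    )


def listed_vertices(alpha):
    """Extreme, private and mixed points as enumerated in the text."""
    s = (2 + alpha) / 3
    return [
        (1, 0, 0), (0, 1, 0), (0, 0, 1),
        (0, 1, alpha), (0, alpha, 1), (0, s, s),
        (1 - alpha, alpha, alpha),
    ]


if __name__ == "__main__":
    a = Fraction(1, 2)
    print(all(in_region(*v, a) for v in listed_vertices(a)))
```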



B. Imperfect delayed CSI: Limited feedback 

In most practical scenarios, delayed CSIT is obtained through 
feedback channel and the current state is then predicted based 
on the delayed CSIT. Due to various reasons, perfect delayed 
CSIT may not be available. For instance, the limited feedback 
rate may incur a distortion on the channel coefficients. In the 
following, we take a look at the impact of the imperfect delayed 
CSIT on the achievable DoF of the proposed scheme. 

First, let us assume that the channel state $S_{t-1}$ is quantized
before being sent back to the transmitter (and to the other
receiver). The quantization model is

\[ S_{t-1} = \bar{S}_{t-1} + \breve{S}_{t-1} \]

where each entry of the quantization noise $\breve{S}_{t-1}$ has the same
variance $\sigma_{\mathrm{FB}}^2$. We introduce a parameter $\beta$ to characterize the
precision of the quantization. As in the definition of $\alpha$, we define
$\beta$ as the power exponent of the quantization noise,³ i.e.,

\[ \beta \triangleq -\frac{\log \sigma_{\mathrm{FB}}^2}{\log P}. \]

Due to the lack of perfect delayed CSIT, instead of using
$S^{t-1}$ to predict $S_t$ for the precoding and using $S_{t-1}$ to perform
the MAT alignment, the transmitter now predicts the quantized
state $\bar{S}_t$ with the past quantized states $\bar{S}^{t-1}$ and uses $\bar{S}_{t-1}$ for
the alignment. Therefore, although the actual interference seen
by the receivers is $(h^H v, g^H u)$, the transmitter only has access
to a noisy version of it, $\eta = (\bar{h}^H v, \bar{g}^H u)$. Receiver 1 has the
following equations

\[
\begin{aligned}
y &= h^H u + h^H v + \varepsilon = h^H u + \eta_1 + (h - \bar{h})^H v + \varepsilon, && (32) \\
\hat{\eta}_1 &= \eta_1 - \Delta_1, \\
\hat{\eta}_2 &= \eta_2 - \Delta_2 = \bar{g}^H u - \Delta_2.
\end{aligned}
\]

³From the rate-distortion function, it is not difficult to relate $\beta$ to the resource
required for the CSI feedback, i.e., the feedback DoF.

Fig. 6. Impact of imperfect delayed CSIT on the achievable DoF with the
proposed scheme. We fix $\alpha = 0.5$ and vary $\beta$ from 1 to 0.



The power of $\eta$ is $\bar{h}^H Q_v \bar{h} + \bar{g}^H Q_u \bar{g}$ that depends on the
"precision" of the prediction from $\bar{S}^{t-1}$ to $\bar{S}_t$. It can be
shown⁴ that the power exponent of this prediction error is
$\alpha' = \min\{\alpha, \beta\}$ where $\alpha$ is the power exponent of the
prediction error when perfect delayed CSIT is present, i.e.,
predicting $S_t$ from $S^{t-1}$. Therefore, the achievable DoF of the
proposed scheme would be $\frac{2+\alpha'}{3}$ without taking into account
the "residual interference" $(h - \bar{h})^H v$ in (32). In fact, this
interference costs a DoF loss of $1 - \beta$ over three slots, yielding
the new DoF per user

\[ d(\alpha, \beta) = \frac{2 + \alpha' - (1 - \beta)}{3} = \frac{1 + \min\{\alpha, \beta\} + \beta}{3}, \quad \alpha, \beta \in [0, 1]. \]



As in the case with perfect delayed CSIT, the DoF pairs
$(1, \alpha')$ and $(\alpha', 1)$ are achievable without the MAT alignment.
An example of the DoF region is shown in Fig. 6 where we
fix the value $\alpha$ and vary $\beta$ from 1 to 0. As shown in the
figure, when $\beta = 1$, the DoF region is unchanged. When $\beta$ is
reduced to $\frac{1+\alpha}{2}$, the symmetric DoF point can be achieved by
time sharing between the two corner points $(1, \alpha)$ and $(\alpha, 1)$.
Delayed CSIT is not beneficial any more with our scheme.
As $\beta$ continues to diminish to $\alpha$, the symmetric DoF keeps
dropping while the corner points remain still. At this point,
using MAT alignment creates more interference than resolving
it. When $\beta$ goes below $\alpha$, it becomes the dominating source of
interference. The corner points become $(1, \beta)$ and $(\beta, 1)$. The
above analysis reveals that even imperfect delayed CSIT can be
beneficial with our scheme, as long as the feedback accuracy
$\beta$ is larger than $\frac{1+\alpha}{2}$. However, it is unclear whether this
naive extension to the imperfect delayed CSIT case is optimal.
Finding optimal schemes with imperfect delayed CSIT remains
an open problem and is out of the scope of this paper.

⁴Without going into the details, we can see that the following Markov chain
holds: $\bar{S}^{t-1} \leftrightarrow S^{t-1} \leftrightarrow S_t \leftrightarrow \bar{S}_t$. The prediction error from $\bar{S}^{t-1}$ to $\bar{S}_t$
is now the aggregation of two effects: the channel variation, characterized by
$P^{-\alpha}$, and the quantization error due to limited feedback rate, characterized
by $P^{-\beta}$. Hence, we have the power exponent of the aggregated error
$\alpha' = \min\{\alpha, \beta\}$.
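
The trade-off described in this subsection can be explored with a few lines of code. The sketch below is an illustration under stated assumptions: it takes the per-user DoF with alignment to be $d(\alpha, \beta) = \frac{1 + \min\{\alpha, \beta\} + \beta}{3}$ (that is, $\frac{2 + \alpha'}{3}$ minus the loss $\frac{1 - \beta}{3}$), and the time-sharing alternative between the corner points $(1, \alpha')$ and $(\alpha', 1)$ to be $\frac{1 + \alpha'}{2}$. Function names are hypothetical.

```python
from fractions import Fraction


def dof_with_alignment(alpha, beta):
    """Per-user DoF of the scheme with MAT alignment and quantized delayed CSIT."""
    alpha, beta = Fraction(alpha), Fraction(beta)
    return (1 + min(alpha, beta) + beta) / 3


def dof_time_sharing(alpha, beta):
    """Per-user DoF from time sharing between (1, alpha') and (alpha', 1)."""
    alpha_prime = min(Fraction(alpha), Fraction(beta))
    return (1 + alpha_prime) / 2


def alignment_is_beneficial(alpha, beta):
    """True when using the delayed CSIT strictly improves on time sharing."""
    return dof_with_alignment(alpha, beta) > dof_time_sharing(alpha, beta)


if __name__ == "__main__":
    alpha = Fraction(1, 2)
    for beta in (Fraction(1), Fraction(9, 10), Fraction(3, 4), Fraction(1, 4)):
        print(beta, dof_with_alignment(alpha, beta), alignment_is_beneficial(alpha, beta))
```

For $\alpha = \frac{1}{2}$ the two expressions meet at $\beta = \frac{3}{4} = \frac{1+\alpha}{2}$, which matches the threshold stated in the text.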

C. Bandwidth-limited Doppler process 

The main result on the achievable DoF has been presented
in terms of an artificial parameter $\alpha$, denoting the speed of
decay of the estimation error $\sigma^2 \sim P^{-\alpha}$ in the current CSIT.
In this section, we provide an example showing the practical
interpretation of this parameter. Focusing on receiver 1 due to
symmetry, we describe the fading process, channel estimation,
and feedback scheme as follows:

• The channel fading $h_t$ follows a Doppler process with
power spectral density $S_h(w)$. The channel coefficients
are strictly band-limited to $[-F, F]$ with $F = \frac{v f_c T_f}{c} < \frac{1}{2}$,
where $v$, $f_c$, $T_f$, and $c$ denote the mobile speed in m/sec,
the carrier frequency in Hz, the slot duration in sec, and the
light speed in m/sec, respectively.

• The channel estimation is done at the receivers side
with pilot-based downlink training. At slot $t$, receiver
1 estimates $h_t$ based on a sequence of noisy observations
$\{s_\tau = \sqrt{P} h_\tau + \nu_\tau\}$ up to $t$, where $\nu_t \sim \mathcal{N}_{\mathbb{C}}(0, I)$ is the
AWGN. The estimate is denoted by $\bar{h}_t$ with

\[ h_t = \bar{h}_t + \breve{h}_t. \]

Under this model, the estimation error vanishes as
$\mathbb{E}(\|\breve{h}_t\|^2) \sim P^{-1}$.

• At the end of slot $t$, the noisy observation $s_t$ is sent to
the transmitter and receiver 2 over a noise-free channel.
At slot $t + 1$, based on the noisy observations $\{s_\tau\}$ up to
$t$, the transmitter and receiver 2 acquire the prediction
$\hat{h}_{t+1}$ of $h_{t+1}$ and the estimate $\bar{h}_t$ of $h_t$. The corresponding
prediction model is

\[ h_t = \hat{h}_t + \tilde{h}_t. \]

From [2, Lemma 1], we have $\mathbb{E}(\|\tilde{h}_t\|^2) \sim P^{-(1-2F)}$.

In this channel with imperfect delayed CSIT, we can still
apply the proposed scheme and analysis in exactly the same
way as in the previous section with $\alpha = 1 - 2F$ and $\beta = 1$.
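
To give a feel for the mapping from physical parameters to $\alpha$, the sketch below computes the normalized Doppler bandwidth $F = v f_c T_f / c$ and the resulting exponent $\alpha = 1 - 2F$. The numerical values in the usage example (30 m/s, 2 GHz carrier, 1 ms slot) are illustrative assumptions, not taken from the paper, and the range check $0 \le F < \frac{1}{2}$ reflects the band-limited assumption as read here.

```python
SPEED_OF_LIGHT_MPS = 299_792_458.0


def normalized_doppler(speed_mps: float, carrier_hz: float, slot_s: float) -> float:
    """Normalized Doppler bandwidth F = v * f_c * T_f / c."""
    return speed_mps * carrier_hz * slot_s / SPEED_OF_LIGHT_MPS


def current_csit_exponent(doppler: float) -> float:
    """Quality exponent alpha = 1 - 2F of the predicted current CSIT."""
    if not 0.0 <= doppler < 0.5:
        raise ValueError("expected a normalized Doppler bandwidth in [0, 0.5)")
    return 1.0 - 2.0 * doppler


if __name__ == "__main__":
    # Hypothetical setting: 30 m/s mobile, 2 GHz carrier, 1 ms slots.
    F = normalized_doppler(30.0, 2.0e9, 1.0e-3)
    print(F, current_csit_exponent(F))
```

In this hypothetical setting $F$ is close to 0.2, so $\alpha$ is close to 0.6; slower mobility or shorter slots push $\alpha$ toward 1.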

D. Non-ergodic fading (delay-limited communications) 

The DoF results have been derived based on the ergodic rates.
For non-ergodic fading processes, the DoF can be redefined
in the same manner as the definition of multiplexing gain in
[12]. This approach has been reported in [8]. Following the
footsteps in [8], it can be shown that the non-ergodic DoF
coincides with the ergodic DoF.

VI. Conclusions 

A scheme achieving the optimal degrees of freedom region 
in a two-user MISO broadcast channel has been presented. The 
approach optimally exploits the combination of delayed channel 
feedback together with imperfect current CSIT. In practical 
scenarios, the current CSIT may be obtained from a prediction 
based on the delayed CSIT samples. When the quality of 
current CSIT is poor, the proposed scheme coincides with
the previously reported MAT space-time alignment, whereas
as the current CSIT prediction quality becomes ideal, the 
scheme relies on standard linear precoding. In between these 
extremal regimes, the proposed strategy advocates interference 
quantization followed by feedback. Generalizations of the 
proposed study to the MIMO case, multi-user case, and 
imperfect delayed CSIT case remain challenging yet interesting 
open problems. 

Appendix 
A. Proof of Lemma 1

First, we show (21) as follows.



<E^_l^^(log(l + Ai||/i,|p)) 
<log{l + Xi\\h,\\^+ma^Xi) 
= \og{l + X,\\h,f)+\og(l 

<\og{l + X,\\h,f)+\og(l 



(33) 



ma^Xi 



\hi 



where ( [33) is from the concavity of the log function. 

Then, to derive (22), let us define $\hat{\psi} \triangleq V^H \hat{g}$ and $\tilde{\psi} \triangleq V^H \tilde{g}$
with $V$ being the unitary matrix containing the eigenvectors
of $K$, i.e., $K = V \,\mathrm{diag}(\lambda_1, \ldots, \lambda_m)\, V^H$. From the isotropic
assumption, $\tilde{\psi}$ has the same distribution as $\tilde{g}$ and is also
isotropic. Since the distribution of the vector $\tilde{\psi}$ is invariant
under unitary transformations, it follows that the distribution of
each scalar $\tilde{\psi}_i$ in $\tilde{\psi}$ is invariant under complex scalar rotations.
Thus, $\tilde{\psi}_i$, $i = 1, \ldots, m$, can be represented by $A_i e^{j\theta_i}$ where
$A_i = |\tilde{\psi}_i|$ is independent of $\theta_i$ that is uniformly distributed in
$[0, 2\pi)$. We need the following lemma for the proof.

Lemma 3: Let $\theta$ be a random variable uniformly distributed
in $[0, 2\pi)$. Then, we have

\[ \mathbb{E}_\theta\big(\log(|B + A e^{j\theta}|^2)\big) = \log(\max\{|A|^2, |B|^2\}). \]

Proof: Without loss of generality, we assume that both
$A$ and $B$ have non-negative real values, since $\theta$ is uniformly
distributed in $[0, 2\pi)$. The expectation $\mathbb{E}_\theta(\log(|B + A e^{j\theta}|^2))$
can be directly calculated as follows:

\[
\begin{aligned}
\mathbb{E}_\theta\big(\log(|B + A e^{j\theta}|^2)\big)
&= \mathbb{E}_\theta\big(\log(A^2 + B^2 + 2AB\cos(\theta))\big) \\
&= \frac{1}{2\pi}\int_0^{2\pi} \log(A^2 + B^2 + 2AB\cos(\theta))\, d\theta \\
&= \log\frac{A^2 + B^2 + \sqrt{(A^2 + B^2)^2 - (2AB)^2}}{2} && (34) \\
&= \log(\max\{|A|^2, |B|^2\})
\end{aligned}
\]

where (34) is from the identity

\[ \int_0^1 \log(a + b\cos(2\pi t))\, dt = \log\frac{a + \sqrt{a^2 - b^2}}{2}, \quad \forall\, a \ge b > 0. \]
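
Lemma 3 is easy to check numerically. The following sketch averages $\log(A^2 + B^2 + 2AB\cos\theta)$ over a uniform grid of phases (a midpoint rule, chosen here for simplicity) and compares the result with $\log\max\{A^2, B^2\}$. The function name and the test values are assumptions made for illustration; the case $A = B$ is avoided because the integrand is then unbounded at $\theta = \pi$.

```python
import math


def mean_log_power(a: float, b: float, steps: int = 20_000) -> float:
    """Midpoint-rule average of log|b + a*exp(j*theta)|^2 over theta in [0, 2*pi)."""
    total = 0.0
    for k in range(steps):
        theta = 2.0 * math.pi * (k + 0.5) / steps
        total += math.log(a * a + b * b + 2.0 * a * b * math.cos(theta))
    return total / steps


if __name__ == "__main__":
    for a, b in ((2.0, 0.5), (0.3, 1.5)):
        print(mean_log_power(a, b), math.log(max(a * a, b * b)))
```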



Now, we can finish the proof of (22) as follows:

E^^^^g^ilogil+g'^Kg,)) 



E^|5. 



log 1 






-Vj 



>E^^I^/log(Ai|Vii+V^in) + 
(E^^I^,(log(Ai|^i+^i|2)))^ 

(%i5.(iog(^ii^in); 

= (log(2V2Ai)) + 



> 



> 



(35) 

(36) 

(37) 

(38) 
(39) 



> log(l + 2'^(t2Ai) - 1 

where in (35), $(x)^+$ means $\max\{x, 0\}$; (36) is from the fact
that moving the maximization outside of the expectation does
not increase the value; (37) is obtained by using the fact that
$\tilde{\psi}_1$ is invariant under complex scalar rotations and by applying
Lemma 3 (averaging over the phase of $\tilde{\psi}_1$); in (38), we define



o.l'AilM _ 



ls..il^ 



7 = E^^i^^^log^j = E^^i^ (log^^j with 7 > 
according to Assumption |2j in ( |39] l, we apply the inequality 

(log(a:))+>log(l + a;)-l. 

B. Proof of Lemma 2

We describe the coding scheme in Lemma 2 as follows.

• Channel codebooks $\mathcal{X}_c, \mathcal{X}_{p1}, \mathcal{X}_{p2}$ of length $n$ and
sizes $2^{nR_c}$, $2^{nR_{p1}}$, and $2^{nR_{p2}}$, respectively. Entries
of these codebooks are generated i.i.d. according to
$\mathcal{N}_{\mathbb{C}}(0, \Lambda_c)$, $\mathcal{N}_{\mathbb{C}}(0, \Lambda_{p1})$, and $\mathcal{N}_{\mathbb{C}}(0, \Lambda_{p2})$, respectively,
with $\Lambda_c, \Lambda_{p1}, \Lambda_{p2} \succeq 0$ being $m \times m$ matrices that can
be assumed to be diagonal without loss of generality.

• Time-varying linear precoders that only depend on the
estimate of the current state:

• Coding: The common message denoted by $W_c$ is coded
in $\{x_{c,t}\}_{t=1}^{n} \in \mathcal{X}_c$, precoded, and then multicast to both
users. Meanwhile, two private messages $W_{p1}$ and $W_{p2}$ for
user 1 and user 2, respectively, are coded in $\{u_{p,t}\}_{t=1}^{n} \in \mathcal{X}_{p1}$
and $\{v_{p,t}\}_{t=1}^{n} \in \mathcal{X}_{p2}$, respectively, precoded, and sent.
The transmitted signal is

\[ x_t = \Omega_t x_{c,t} + \Xi_t u_{p,t} + \Gamma_t v_{p,t}, \quad t = 1, \ldots, n. \]

Then, we can get the following achievable rate region.

Proposition 1: The achievable rate region of the two-user
MISO broadcast channel with common message is the union
of the rate triples $(R_c, R_{p1}, R_{p2})$ with

\[
\begin{aligned}
R_c &\triangleq \min\left\{ \mathbb{E}\log\left(1 + \frac{h^H Q_c h}{1 + h^H (Q_{p1} + Q_{p2}) h}\right),\;
\mathbb{E}\log\left(1 + \frac{g^H Q_c g}{1 + g^H (Q_{p1} + Q_{p2}) g}\right) \right\}, \\
R_{p1} &\triangleq \mathbb{E}\log\left(1 + \frac{h^H Q_{p1} h}{1 + h^H Q_{p2} h}\right), \\
R_{p2} &\triangleq \mathbb{E}\log\left(1 + \frac{g^H Q_{p2} g}{1 + g^H Q_{p1} g}\right),
\end{aligned}
\]

over all policies

\[ \mathcal{Q}(\hat{S}) \triangleq \{ Q_c, Q_{p1}, Q_{p2} \succeq 0 : \mathrm{tr}(Q_c + Q_{p1} + Q_{p2}) \le P \} \]

that only depend on the estimate of the channels $\hat{S}$.

Proof: The proof is straightforward. First, the common
message is decoded by treating the private signals as noise.
Then, after removing the decoded common signal, the private
message is obtained by treating the interference as noise.
The covariance matrices are such that $Q_c = \Omega \Lambda_c \Omega^H$,
$Q_{p1} = \Xi \Lambda_{p1} \Xi^H$, $Q_{p2} = \Gamma \Lambda_{p2} \Gamma^H$. Further details are omitted. ■

Setting Q, - PI, Qpi - P'^9g±, and Qp2 - P"*/,^, 
Lemma |2] follows immediately. 



C. Achievable rate region of the sum-DoF optimal scheme 

Let us recall that the proposed scheme consists of two phases.
In the following, we let $n_1$ and $n_2$ denote the length of Phase 1
and Phase 2, in channel uses, respectively. The main ingredients
in Phase 1 are:

• Codebook generation:

- Channel codebooks $\mathcal{X}_u$ of length $n_1$ and size
$2^{n_1 R_{\mathrm{mimo},1}}$, and $\mathcal{X}_v$ of length $n_1$ and size $2^{n_1 R_{\mathrm{mimo},2}}$. Entries
of $\mathcal{X}_u$ and $\mathcal{X}_v$ are generated i.i.d. according to
$\mathcal{N}_{\mathbb{C}}(0, \Lambda_u)$ and $\mathcal{N}_{\mathbb{C}}(0, \Lambda_v)$, respectively, where $\Lambda_u, \Lambda_v \succeq 0$
are $m \times m$ diagonal matrices.

- Source codebooks $\mathcal{C}_k$ of length $n_1$ and size $2^{n_1 R_{\eta_k}}$,
$k = 1, 2$. Entries of $\mathcal{C}_1$ and $\mathcal{C}_2$ are generated i.i.d.
according to $\mathcal{N}_{\mathbb{C}}(0, 1 - D_k)$, $D_k \le 1$, $k = 1, 2$.

• Time-varying linear precoders that only depend on the
estimate of the current state:

• Coding in Phase 1: The codewords $\{u_t\}_{t=1}^{n_1}$ and $\{v_t\}_{t=1}^{n_1}$
are selected from $\mathcal{X}_u$ and $\mathcal{X}_v$, according to $W_{\mathrm{mimo},1}$ and
$W_{\mathrm{mimo},2}$, respectively. The transmitted signal is

\[ x_t = \Theta_t u_t + \Phi_t v_t, \quad t = 1, \ldots, n_1. \]

• Quantization of the interferences $\eta_1$ and $\eta_2$: At the
end of Phase 1, the transmitter knows $\{(\eta_{1,t}, \eta_{2,t})\}_{t=1}^{n_1}$
with $\eta_{1,t} = h_t^H v_t \sim \mathcal{N}_{\mathbb{C}}(0, \sigma_{\eta_1,t}^2)$ and $\eta_{2,t} = g_t^H u_t \sim
\mathcal{N}_{\mathbb{C}}(0, \sigma_{\eta_2,t}^2)$ for a given channel realization $\{h_t, g_t\}_{t=1}^{n_1}$.
The codebook $\mathcal{C}_k$, $k = 1, 2$, is used to quantize the
normalized source $\{\eta_{k,t} / \sigma_{\eta_k,t}\}$ that is i.i.d. $\mathcal{N}_{\mathbb{C}}(0, 1)$. The
quantized outputs are represented in $n_1 (R_{\eta_1} + R_{\eta_2})$ bits.

In Phase 2, exactly the same codebooks and precoders as in
Appendix B are used, except that the length of the codewords
is $n_2$ instead of $n$. The quantized interferences, represented
in $n_1 (R_{\eta_1} + R_{\eta_2})$ bits and denoted by $W_c$, are coded in
$\{x_{c,t}\}_{t=n_1+1}^{n_1+n_2} \in \mathcal{X}_c$, precoded, and then multicast to both users.
Meanwhile, two private messages $W_{p1}$ and $W_{p2}$ for user 1 and
2 are coded in $\{u_{p,t}\}_{t=n_1+1}^{n_1+n_2} \in \mathcal{X}_{p1}$ and $\{v_{p,t}\}_{t=n_1+1}^{n_1+n_2} \in \mathcal{X}_{p2}$,
respectively, precoded, and sent. The transmitted signal is

\[ x_t = \Omega_t x_{c,t} + \Xi_t u_{p,t} + \Gamma_t v_{p,t}, \quad t = n_1 + 1, \ldots, n_1 + n_2. \]

For user $k$ to recover its original messages $(W_{\mathrm{mimo},k}, W_{pk})$
correctly⁵ when $n_1, n_2 \to \infty$, it is enough to

• recover the message $(W_c, W_{pk})$, which is possible if

\[ n_1 (R_{\eta_1} + R_{\eta_2}) \le n_2 R_c, \tag{40} \]

and if the triple $(R_c, R_{p1}, R_{p2})$ lies in the region defined
in Proposition 1;

• reconstruct $\{\hat{\eta}_{k,t}\}_{t=1}^{n_1}$, $k = 1, 2$, with

\[ \eta_{k,t} = \hat{\eta}_{k,t} + \Delta_{k,t}, \quad \Delta_{k,t} \sim \mathcal{N}_{\mathbb{C}}(0, \sigma_{\eta_k,t}^2 D_k), \]

which is possible if

\[ R_{\eta_k} \ge \log\left(\frac{1}{D_k}\right), \quad k = 1, 2; \]

• then decode the message $W_{\mathrm{mimo},k}$, which is possible if

\[
\begin{aligned}
R_{\mathrm{mimo},1} &\le I(U; Y, \hat{\eta}_1, \hat{\eta}_2 \mid S, \hat{S}), && (41) \\
R_{\mathrm{mimo},2} &\le I(V; Z, \hat{\eta}_1, \hat{\eta}_2 \mid S, \hat{S}).
\end{aligned}
\]

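Putting all pieces together, we obtain the rate region of the
proposed scheme in the following.

Proposition 2: Let $(R_c, R_{p1}, R_{p2})$ be defined as in
Proposition 1 and let us define the compression rate $R_{\eta_k}$ and MIMO
rate as

\[
\begin{aligned}
R_{\eta_k} &\triangleq \log\frac{1}{D_k}, \quad k = 1, 2, \\
R_{\mathrm{mimo},1} &\triangleq \mathbb{E}\big(\log\det(I + \mathbf{D}_1 S Q_u S^H)\big), && (42) \\
R_{\mathrm{mimo},2} &\triangleq \mathbb{E}\big(\log\det(I + \mathbf{D}_2 S Q_v S^H)\big), && (43)
\end{aligned}
\]

with

\[
\begin{aligned}
\mathbf{D}_1 &\triangleq \mathrm{diag}\left(\frac{1}{1 + h^H Q_v h D_1},\; \frac{1 - D_2}{g^H Q_u g D_2}\right), \\
\mathbf{D}_2 &\triangleq \mathrm{diag}\left(\frac{1 - D_1}{h^H Q_v h D_1},\; \frac{1}{1 + g^H Q_u g D_2}\right).
\end{aligned}
\]

Then, the achievable rate region of the proposed scheme is the
union of the rate pairs $(R_1, R_2)$ with

\[ R_k \triangleq \frac{R_c R_{\mathrm{mimo},k} + (R_{\eta_1} + R_{\eta_2}) R_{pk}}{R_c + R_{\eta_1} + R_{\eta_2}}, \quad k = 1, 2, \tag{44} \]

over all policies $\mathcal{D}(\hat{S}) \triangleq \{D_1, D_2 : 0 \le D_k \le 1\}$ and

\[ \mathcal{Q}'(\hat{S}) \triangleq \{ Q_u, Q_v, Q_c, Q_{p1}, Q_{p2} \succeq 0 : \mathrm{tr}(Q_u + Q_v) \le P,\; \mathrm{tr}(Q_c + Q_{p1} + Q_{p2}) \le P \} \]

that only depend on the estimate of the channels.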
Proof: The average achievable rate for user $k$ is

\[
R_k = \frac{n_1 R_{\mathrm{mimo},k} + n_2 R_{pk}}{n_1 + n_2}
= \frac{R_{\mathrm{mimo},k} + \frac{n_2}{n_1} R_{pk}}{1 + \frac{n_2}{n_1}}
= \frac{R_c R_{\mathrm{mimo},k} + (R_{\eta_1} + R_{\eta_2}) R_{pk}}{R_c + R_{\eta_1} + R_{\eta_2}}
\]

where the last equality holds by choosing $n_1$ and $n_2$ that
equalize (40). To see (42), we write

\[
\begin{aligned}
I(U; Y, \hat{\eta}_1, \hat{\eta}_2 \mid \boldsymbol{S} = S, \hat{\boldsymbol{S}} = \hat{S})
&= I(U; \hat{\eta}_1) + I(U; Y, \hat{\eta}_2 \mid \hat{\eta}_1) && (45) \\
&= I(U; Y, \hat{\eta}_2 \mid \hat{\eta}_1) && (46) \\
&= I(U; Y - \hat{\eta}_1, \hat{\eta}_2 \mid \hat{\eta}_1) \\
&= I(U; h^H U + \Delta_1 + E, \hat{\eta}_2) && (47) \\
&= I(U; h^H U + \Delta_1 + E, a_{\mathrm{mmse}}\, g^H U + e_{\mathrm{mmse}}) \\
&= \log\det(I + \mathbf{D}_1 S Q_u S^H) && (48)
\end{aligned}
\]

where (45) is from the chain rule of mutual information; (46) is
from the fact that $U$ is independent of $\hat{\eta}_1$; (47) holds because $\hat{\eta}_1$
is independent of all the other terms. Since $\eta_2 = \hat{\eta}_2 + \Delta_2$ with
$\hat{\eta}_2 \sim \mathcal{N}_{\mathbb{C}}(0, g^H Q_u g (1 - D_2))$ and $\Delta_2 \sim \mathcal{N}_{\mathbb{C}}(0, g^H Q_u g D_2)$
being additive Gaussian noise, we can optimally "estimate" $\hat{\eta}_2$
from $\eta_2$ with a linear MMSE estimator and get the "backward
channel" model

\[ \hat{\eta}_2 = a_{\mathrm{mmse}}\, \eta_2 + e_{\mathrm{mmse}} \]

where $a_{\mathrm{mmse}} = 1 - D_2$ corresponds to the scaling of the linear
MMSE estimation and the additive estimation noise $e_{\mathrm{mmse}} \sim
\mathcal{N}_{\mathbb{C}}(0, a_{\mathrm{mmse}}\, g^H Q_u g\, D_2)$ is independent of the "input" $\eta_2$ of
the estimator. Thus, (48) follows as the mutual information of
an equivalent Gaussian MIMO channel with Gaussian input,
where $Q_u = \Theta \Lambda_u \Theta^H$ and $Q_v = \Phi \Lambda_v \Phi^H$. Note that in the
right hand sides of the above equalities, we have omitted
the conditioning on $\{\boldsymbol{S} = S, \hat{\boldsymbol{S}} = \hat{S}\}$ for convenience of
presentation. Finally, (42) follows from (41) and (48). Due to
the symmetry, (43) is straightforward. ■

⁵Note that the assumption on the ergodicity and the Markov chain (2) makes
the single-letter representation of the rates possible.

Note that the optimization in ( |44j i is not trivial and is out 
of the scope of this paper. Instead of finding the exact rate, 
we focus on the symmetric degrees of freedom of the scheme 
with 77i = 2, by fixing the following parameters: 



Q« = 


^i*., + ^2 Pi P2 
2 » + 2 ^' ^" " 2 '^" ^ 2 '^' 


Qc 


= ^i, Qpi = f*,., Qp2-f*,., 


Di- 


= ^2 = (Pt2)-1 = p-(l-") 



where we recall that 9g 



gg" 



and * 



a" 



(49) 



*£., and *£. , are 



similarly defined; the power allocations {Pc,Pp) and {Pi,P2) 
are specified by 



Pn = aa- ^ , 



P2 = {i-a)-a' 



Pc=P- Pp, 

Pl = p- P2, 



with a^ = niax{P ^i"'^} and 

tion of the choices on the covariance matrices has already been 



- ~T^- The interpreta- 



given in Section IV-B2 For the choices of the distortions d49^ 



and the power allocations, the intuitions are as follows: 

• The distortions Di and D2 are such that the errors {Ak.t} 
after the reconstruction of 771 and 772 are at the noise level. 

• The transmit power of the private signals scales as Pp ^ 
P", while the received power at the unintended receiver 



scales as P*', i.e., the noise level. Thus, the private signal 
does not incur any DoF loss for the unintended receiver. 



The scaling factor a ensures that Pp — P and P^ 







when the estimation error is small, i.e., a^ < P^^ while 
leading to Pp = and P^ — P when the estimation error 
is high, i.e., a^ = 1. Similarly, with (1 — a). Pi — P 
and P2 = when the estimation error is small, while 
Pi = P2 — ^ when the estimation error is high. 

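It is readily shown that, with these choices, we have the high
SNR approximation of the rates

\[
\begin{aligned}
R_c &= (1 - \alpha)\log P + O(1), \\
R_{pk} &= \alpha \log P + O(1), \quad k = 1, 2, \\
R_\eta &= 2(1 - \alpha)\log P + O(1), \\
R_{\mathrm{mimo},k} &= (2 - \alpha)\log P + O(1), \quad k = 1, 2,
\end{aligned}
\]

from which we derive the symmetric DoF $d_{\mathrm{sym}} = \frac{2+\alpha}{3}$.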
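
The last step can be verified by plugging the pre-log factors into the time-sharing expression for the per-user rate, $R_k = \frac{R_c R_{\mathrm{mimo},k} + R_\eta R_{pk}}{R_c + R_\eta}$ with $R_\eta = R_{\eta_1} + R_{\eta_2}$. The sketch below is a minimal illustration with a hypothetical function name; the case $\alpha = 1$, where there is nothing to multicast and Phase 2 is assumed to have zero length, is handled separately to avoid a division by zero.

```python
from fractions import Fraction


def symmetric_dof_from_prelogs(alpha):
    """Per-user DoF obtained from the high-SNR pre-logs of the four rates."""
    alpha = Fraction(alpha)
    d_c = 1 - alpha           # common message (multicast of quantized interference)
    d_p = alpha               # private message in Phase 2
    d_eta = 2 * (1 - alpha)   # total compression rate of both interferences
    d_mimo = 2 - alpha        # equivalent MIMO channel of Phase 1
    if d_eta == 0:
        return d_mimo         # assumption: Phase 2 is not needed when alpha = 1
    return (d_c * d_mimo + d_eta * d_p) / (d_c + d_eta)


if __name__ == "__main__":
    for a in (Fraction(0), Fraction(1, 2), Fraction(1)):
        print(a, symmetric_dof_from_prelogs(a), (2 + a) / 3)
```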